You are on page 1of 15

50 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 7, NO.

1, JANUARY-MARCH 2010

Secure Data Objects Replication in Data Grid


Manghui Tu, Member, IEEE, Peng Li, I-Ling Yen, Member, IEEE,
Bhavani Thuraisingham, Fellow, IEEE, and Latifur Khan, Member, IEEE

Abstract—Secret sharing and erasure coding-based approaches have been used in distributed storage systems to ensure the
confidentiality, integrity, and availability of critical information. To achieve performance goals in data accesses, these data
fragmentation approaches can be combined with dynamic replication. In this paper, we consider data partitioning (both secret sharing
and erasure coding) and dynamic replication in data grids, in which security and data access performance are critical issues. More
specifically, we investigate the problem of optimal allocation of sensitive data objects that are partitioned by using secret sharing
scheme or erasure coding scheme and/or replicated. The grid topology we consider consists of two layers. In the upper layer, multiple
clusters form a network topology that can be represented by a general graph. The topology within each cluster is represented by a tree
graph. We decompose the share replica allocation problem into two subproblems: the Optimal Intercluster Resident Set Problem
(OIRSP) that determines which clusters need share replicas and the Optimal Intracluster Share Allocation Problem (OISAP) that
determines the number of share replicas needed in a cluster and their placements. We develop two heuristic algorithms for the two
subproblems. Experimental studies show that the heuristic algorithms achieve good performance in reducing communication cost and
are close to optimal solutions.

Index Terms—Secure data, secret sharing, erasure coding, replication, data grids.

1 INTRODUCTION

D ATA grid is a distributed computing architecture that


integrates a large number of data and computing
resources into a single virtual data management system [2].
locations of dangerous chemicals and hazard containment
devices, can help draw relatively safe and effective rescue
plans. Delayed accesses to these data can endanger the
It enables the sharing and coordinated use of data from responders as well as increase the risk to the victims or
various resources and provides various services to fit the cause severe damages to the property. At the same time, the
needs of high-performance distributed and data-intensive information such as location of hazardous chemicals is
computing. Many data grid applications are being devel- highly sensitive and, if falls in the hands of terrorists, could
oped or proposed, such as DoD’s Global Information Grid cause severe consequences. Thus, confidentiality of the
(GIG) for both business and military domains [8], NASA’s critical information should be carefully protected. The
Information Power Grid [22], GMESS Health-Grid for above example indicates the importance of data grids and
medical services [7], data grids for Federal Disaster Relief their availability, reliability, accuracy, and responsiveness.
[33], etc. These data grid applications are designed to Replication is frequently used to achieve access effi-
support global collaborations that may involve large ciency, availability, and information survivability. The
amount of information, intensive computation, real time, underlying infrastructure for data grids can generally be
or nonreal time communication. Success of these projects classified into two types: cluster based and peer-to-peer
can help to achieve significant advances in business, systems [6], [19]. In pure peer-to-peer storage systems, there
is no dedicated node for grid applications (in some systems,
medical treatment, disaster relief, research, and military
some servers are dedicated). Replication can bring data
and can result in dramatic benefits to the society.
objects to the peers that are close to the accessing clients
There are several important requirements for data grids,
and, hence, improve access efficiency. Having multiple
including information survivability, security, and access
replicas directly implies higher information survivability. In
performance [2], [3], [21]. For example, consider a first
cluster-based systems, dedicated servers are clustered
responder team responding to a fire in a building with
together to offer storage and services. However, the number
explosive chemicals [5]. The data grid that hosts building
of clusters is generally limited and, thus, they may be far
safety information, such as the building layout and
from most clients. To improve both access performance and
availability, it is necessary to replicate data and place them
. M. Tu is with the Department of Computer Science and Information close to the clients, such as peer-to-peer data caching. As
Systems, Southern Utah University, 351 West University Blvd., Cedar can be seen, replication is an effective technique for all types
City, UT 84720. E-mail: tumh2000@yahoo.com.
of data grids. Existing research works on replication in data
. P. Li, I. Yen, B. Thuraisingham, and L. Khan are with the Department of
Computer Science, University of Texas at Dallas, 800 West Campbell grids investigate replica access protocols [3], [29], resource
Road, MS EC31, Richardson, TX 75080-3021. management and discovery techniques [27], [30], replica
E-mail: {ilyen, bhavani.thuraisingham, lkhan}@utdallas.edu. location and discovery algorithms [3], [9], [29], and replica
Manuscript received 13 Aug. 2006; revised 10 Sept. 2007; accepted 7 Jan. placement issues [25].
2008; published online 12 Mar. 2008. Though replication can greatly help with information
For information on obtaining reprints of this article, please send e-mail to:
tdsc@computer.org, and reference IEEECS Log Number TDSC-0021-0206. survivability and access efficiency, it does not address
Digital Object Identifier no. 10.1109/TDSC.2008.19. security requirements. Having more replicas implies a
1545-5971/10/$26.00 ß 2010 IEEE Published by the IEEE Computer Society
TU ET AL.: SECURE DATA OBJECTS REPLICATION IN DATA GRID 51

higher potential of compromising one of the replicas [37],


[38]. One solution is to encrypt the sensitive data and their
replicas. However, it pushes the responsibility of protecting
data to protecting encryption keys and brings a nontrivial
key management problem [17], [38]. If a simple centralized
key server system is used, then it is vulnerable to single
point of failure and denial of service attacks. Also, the
centralized key server may be compromised and, hence,
reveal all keys. Replication of keys can increase its access
efficiency as well as avoiding the single-point failure
problem and reducing the risk of denial of service attacks, Fig. 1. A sample graph of our system topology.
but would increase the risk of having some compromised
key servers. If one of the key servers is compromised, all the data set is considered). The placement problem for
critical data are essentially compromised. Beside key partitioned data is more complex since the replicas of the
management issues, information leakage is another pro- data partitions need to be considered together. Moreover,
blem with the replica encryption approach [14], [15], [24]. client access patterns for partitioned data are more
Generally, a key is used to access many data objects. When a complicated. Thus, it is necessary to investigate the schemes
client leaves the system or its privilege for some accesses is for allocating partitioned data.
revoked, those data objects have to be reencrypted using a The research on replica placement of partitioned data is
new key and the new key has to be distributed to other limited. In [20], the authors attempt to measure the security
clients. If one of the data storage servers is compromised, assurance probabilities and use it to guide allocation. They
the storage server could retain a copy of the data encrypted consider data objects that are secret shared but no
using the old key. Thus, the content of long-lived data may replication is considered. Also, the share allocation algo-
leak over time. Therefore, additional security mechanisms rithm they propose does not consider performance issues
are needed for sensitive data protection. such as communication cost and response latency. The
There are other methods proposed for providing work in [17] considers a secure data storage system that
survivable and secure storage in untrustworthy environ- survives even if some nodes in the system are compro-
ment. The most effective approach is to provide intrusion mised. It assumes that data are secret shared and the full set
tolerance [4], [18], [26], [31], [36]. Most intrusion-tolerant of shares are replicated and statically distributed over the
systems partition the sensitive data and distribute the network. The major focus of this work is to guarantee
shares across the storage sites to achieve both confidenti- confidentiality and integrity requirements of the storage
ality and survivability [4], [16], [17], [20], [24], [28], [36], [37]. system. The communication cost and response latency are
By doing so, a data object can remain secure even if a partial not considered. Also, it does not address how to allocate the
set of its shares (below a threshold) are compromised by the share replicas.
adversaries. This scheme can be used to protect critical data In this paper, we consider combining data partitioning
directly or to protect the keys of the encrypted data. and replication to support secure, survivable, and high-
The intrusion tolerance concept and data partitioning performance storage systems. Our goal is to develop
techniques can be used to achieve data survivability as well placement algorithms to allocate share replicas such that
as security. The most commonly used schemes for data the communication cost and access latency are minimized.
partitioning include secret sharing [28] and erasure coding The remainder of this paper is organized as follows:
[24]. Both schemes partition data into shares and distribute Section 2 describes a data grid system model and the
them to different processors to achieve availability and problem definitions. Section 3 introduces a heuristic
integrity [17], [20], [37]. Secret sharing schemes assure algorithm for determining the clusters that should host
confidentiality even if some shares (less than a threshold) shares. A heuristic algorithm for share allocation within a
are compromised. In erasure coding, data shares can be cluster is presented in Section 4. In Section 5, the results of
encrypted and the encryption key can be secret shared and the experimental studies are discussed. Section 6 discusses
distributed with the data shares to assure confidentiality some research works that are related to this research.
[14], [15], [24]. However, changing the number of shares in a Section 7 states the conclusion of this paper.
data partitioning scheme is generally costly. When it is
necessary to add additional shares close to a group of
clients to reduce the communication cost and access latency,
2 SYSTEM MODEL AND PROBLEM SPECIFICATION
it is easier to add share replicas. Thus, it is most effective to In this paper, we consider achieving secure, survivable, and
combine the data partitioning and replication techniques for high-performance data storage in data grids. To facilitate
high-performance secure storage design. scalability, we model the peer-to-peer data grid as a two-
To actually achieve improved performance, it is essential level topology (shown in Fig. 1). Studies show that the
to place the replicated data partitions in strategic locations internet can be decomposed into interconnected autono-
to maximize the gain. There are extensive works in replica mous systems [11], [23]. One or several such autonomous
placement [1], [13], [34], [35], [12]. However, existing systems that are geographically close to each other can be
placement algorithms focus on the placement of indepen- considered as a cluster. The system consists of M clusters,
dent data objects (generally, only a single data or a single H1 ; . . . ; HM , which are linked together and form a general
52 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 7, NO. 1, JANUARY-MARCH 2010

graph topology GC ¼ ðH C ; E C Þ. Here, H C ¼ fH1 ; . . . ; HM g incur an m=ðm  kÞ fold of storage waste (the rest are for
and E C is the set of edges connecting the clusters. Each edge improving availability and performance). If the storage
represents a logical link which may be multiple hops of the space is a concern, then erasure coding schemes can be
physical links. It is likely that the clusters are linked to the used. An erasure coding scheme uses the same mathe-
backbone and should be modeled by a general graph. matics except that the k  1 coefficients of the polynomial
Within each cluster, there may be many subnets from the are dependent to d. Thus, partial information may be
same or multiple institutions. Among all the physical nodes inferred with fewer than k shares, and hence, encryption
in the cluster, some nodes, such as servers, proxies, and is needed for confidentiality assurance. Generally, the
other individual nodes, may be committed to contribute its encryption keys are secret shared and distributed with the
storage and/or computation resources for some data grid data. Erasure coding schemes achieve best storage
applications. These nodes are connected via logical links. efficiency, even when compared with replication [14],
According to [23], internet message routing is relatively [15], [16], [24]. The access performance in secret sharing,
stable in days or even weeks and the multiple routing paths erasure coding, and replication (with secret shared keys)
generally form a tree. Thus, for simplicity, we model the schemes are approximately the same. Here, we do not
topology inside a cluster as a tree. Consider a cluster Hx . Let limit the data partitioning schemes (as long as it is
Gx ¼ ðVx ; Ex Þ represents the topology graph within the secure). To ensure that the secret data can be recon-
cluster, where Vx ¼ fPx;1 ; Px;2 ; . . . ; Px;Nx g denotes the set of structed even when k  1 nodes are compromised, we
Nx (N if only considering cluster Hx ) nodes in cluster Hx , require m  2k  1. Let l denote the number of distinct
and Ex is the set of edges connecting nodes in Hx . Also, let shares to be accessed for each read request (it is fixed for
Pxroot denote the root node in Hx (e.g., Pxroot 2 Vx ). We all read requests). We have k  l  m. If l > k, the original
assume that all traffic in Hx goes through the network data can be reconstructed and validity of the shares can
where Pxroot resides. Let ðPx;i ; Px;j Þ denote the shortest path be checked. The parameter l can be determined for each
between Px;i and Px;j in Hx , and jðPx;i ; Px;j Þj denote the specific application system depending on its needs.
distance of ðPx;i ; Px;j Þ. Also, let ðHx ; Hy Þ denote the In many applications, data could be read as well as
shortest path between Hx and Hy (actually, between Pxroot updated by clients from geographically distributed areas.
and Pyroot ), and jðHx ; Hy Þj denote the distance of ðHx ; Hy Þ. For example, in a major rescue mission or a disaster relief act,
We assume that jðPx;i ; Px;j Þj for any i, j, and x is much less the problem areas and resources need to be updated in real
than jðHy ; Hz Þj for any y and z, where y 6¼ z (i.e., the time to facilitate dynamic and effective planning. Net-centric
distance between any two nodes within a cluster is less than command and control systems rely on GIG for its dynamic
the distance between any two clusters). information flow, and frequently, critical data need to be
The data grid (represented by the set of clusters H C ) updated on-the-fly to support agility. Also, updates in
storage systems for encryption keys can be quite frequent
hosts a set of data objects D (D can contain the application
due to the changes in membership and access privileges of
data or keys). One of the clusters is selected as the Master
the individuals. In our model, for each update request, all
Server Cluster (MSC) for some data objects in D, denoted as
shares and share replicas need to be updated using a primary
HMSC (different data objects may have different HMSC ).
lazy update protocol as that discussed in [10] and [16].
HMSC hosts these data objects permanently (it may be the
Generally, eager update is not feasible in widely distributed
original data source). These data objects may be partially
systems since it takes too long to finish the updates. Also, the
replicated in other clusters in H C to achieve better access
large-scale network may be partitioned and some clusters
performance. Due to the increasing attacks on Internet, a
may be temporarily unreachable. Thus, a lazy update is more
node hosting some data objects in D has a significant chance
suitable. Furthermore, a primary copy is frequently used to
of being compromised. If a node is compromised, all the
avoid system delusion when the system size is large or the
plaintext data objects stored on it are compromised. If a
update ratio is high [10]. Based on the primary lazy update
storage node storing some encrypted data is compromised
protocol, all update requests are first forwarded to HMSC for
and the nodes maintaining the corresponding encryption execution and the updates are then propagated to other
keys are also compromised (note that they may be the same clusters along a minimum spanning tree (MST) as described
nodes), then the data are compromised. We assume that the in [34]. Consistency can also be maintained periodically
probabilities of compromising two different storage nodes using a distributed vector clock [16], [17] without concerning
are not correlated. This is true for many new attacks, such node failures or network partitioning. Moreover, various
as viruses that are spread through emails and the major update execution protocols can be chosen flexibly to further
buffer overflow attacks. achieve Byzantine fault tolerance [16] and/or high security
To cope with potential threats, the data partitioning assurance [17].
technique is used. Each data object d ðd 2 DÞ is partitioned More details of the read and update protocols and their
into m shares. Major data partitioning schemes include costs will be discussed in the next section. In this paper, we
secret sharing [17], [20], [28], [37] and erasure coding [14], assume secure communication channels for the delivery of
[15], [16], [24], [37]. In an ðm; kÞ secret sharing scheme, data shares. Standard encryption algorithms, such as SSL,
m shares are computed from a data object d using a can be used to achieve this.
polynomial with k  1 randomly chosen coefficients and
distributed over m servers. d can be determined uniquely 2.1 Access Model and Problem Decomposition
with any k shares and no information about d can be Data placement decisions are made based on historical client
inferred with fewer than k shares. Secret sharing schemes access patterns. We model access patterns by analyzing the
TU ET AL.: SECURE DATA OBJECTS REPLICATION IN DATA GRID 53

number of read/write accesses from each node or each (encrypted using the session key between Hy and the
cluster. Consider a data object d. Let T denote the time client); 2) Pyroot puts the pieces into one message and sends
period unit for collecting information of access patterns. Let it to Pxroot ; and 3) Pxroot sends the shares to the requesting
Ar ðPx;i Þ and Aw ðPx;i Þ denote the numbers of read and write node Px;i and Px;i forwards them to client C. Overall, the
accesses, respectively, initiated from node Px;i over time T . read cost is jðPx;i ; Pxroot Þj þ jðHx ; RC Þj þ jðPyroot ; Ry ; lÞj.
Also, let Ar ðHx Þ and Aw ðHx Þ denote the numbers of read and Note that jðPx;i ; Rx ; lÞj ¼ 0 if Hx hosts less than l shares,
write accesses, respectively,
P initiated from a Pcluster Hx over and jðPx;i ; Pxroot Þj þ jðHx ; RC Þj þ jðPyroot ; Ry ; lÞj ¼ 0, other-
time T . Ar ðHx Þ ¼ i Ar ðPx;i Þ and Aw ðHx Þ ¼ i Aw ðPx;i Þ. Let wise. Let readCost denote the total read cost in the system.
wC denoteP the total number of update requests on d in GC , We have
i.e., wC ¼ Hx Aw ðHx Þ.
Based on the client access patterns, subsets of the readCost
m shares of each data in D may be replicated to the clusters 8 
> ðPx;i ; Rx ; lÞ; if Hx holds
in H C . The set of clusters that hold shares is defined as the >
>
>
cluster level resident set. Let RC denote the cluster level X X> <
   
at least l shares;
resident set, i.e., RC ¼ fHx j clusters hold sharesg. To mini- ¼ ðPx;i ; P root Þ þ ðHx ; RC Þ
>
> x
mize the communication cost, we consider that a cluster
Hx i >
>  
>
:  
þðPyroot ; Ry ; lÞ; otherwise:
holds either none or at least l distinct share replicas (will be
proven in Theorem 3.1). Also, we assume that each cluster
holds only distinct shares (i.e., at most m) since, otherwise, As discussed earlier, the update access protocol is primary
extra efforts are required to avoid reading duplicated shares lazy update. Inside a cluster Hx , updates are always
(a large m value ensure that a sufficient distinct shares can propagated from the root node Pxroot to all other nodes
be allocated to each cluster). Intracluster residence set is the holding share replica along an MST. The update cost for a
set of nodes that holds a share replica within a cluster. Let client at node Px;j updating P shares is jðPx;i ; Pxroot Þj þ
C C root
Rx denote the intracluster residence set of Hx , i.e., Rx ¼ jðHx ; HMSC Þj þ j ðR Þj þ Hy jðPyP ; RP y ; jRy jÞj, w h e r e
fPx;i j Px;i holds a share and Px;i is in Hx g. Correspondingly, Hy 2 RC . Let forwardCost denote Hx
root
i jðPx;i ; Px Þj þ
jRx j denotes the number of share replicas in Hx . We say Rx jðHx ; HMSC Þj.
(or RC ) is connected if and only if every node in Rx (or RC ) Let updateCost denote the overall update cost in the
has at least one path to any other node in Rx (or RC ), and system. We have
each node on the path also belongs to Rx (or RC ).   X   root  
Otherwise, Rx (or RC ) is partitioned. In this case, Rx (or updateCost ¼ wC  C ðRC Þ þ  P ; Rx ; jRx j 
Hx x
RC ) contains multiple subgraphs Rx;1 ; Rx;2 ; . . . ; Rx;n (or þ forwardCost:
RC1 ; RC2 ; . . . ; RCn ), n > 1, Rx;i (or RCi ), 1  i  n, is a con-
nected subgraph, and Rx;i and Rx;j (or RCi and RCj Þ, i 6¼ j, Note that we do not consider the extra cost required
are not connected. for the detection and recovery of an invalid share. With
Now, consider the intracluster level. Let ðPx;i ; Rx Þ both update and read cost, the total cost becomes
denote the shortest path from Px;i to any node in Rx , and Tcost ¼ updateCost þ readCost. Table 1 gives a summary
ðPx;i ; Pxroot Þ denote the shortest path from Px;i to the root of the notation used in this paper.
node in Hx . Also, let ðHx ; RC Þ denote the shortest path from Our goal is to replicate the data shares and allocate
Hx to the closest cluster in RC , and jðHx ; RC Þj is the distance them to different nodes in the data grid to minimize
of path ðHx ; RC Þ (only counting the cluster level cost). Let Tcost. We decompose the allocation problem into two
ðPx;i ; Rx ; Þ denote the MST rooted at Px;i and includes a subproblems—intracluster and intercluster share alloca-
total of  nodes hosting shares in cluster Hx . jðPx;i ; Rx ; Þj tion problems—and deal with them separately and
represents the total distance of the MST. Let C ðRC Þ denote independently. First, consider the updateCost. All updates
the MST from HMSC to all clusters in RC at the cluster level. need to be sent to a cluster that holds share replicas.
jC ðRC Þj is the total distance of the MST C ðRC Þ, but only Assume that Hx holds share replicas. Pxroot has the
considering the costs at the cluster level (i.e., the distance to knowledge of the total update access number wC (note
the root node of each involved cluster). that every update needs to be propagated to Hx ) and the
The read access protocol tries to read the closest l share topology within the cluster. The forwardCost is a constant,
replicas. Consider a client C sending a read request to Px;i . because it has no impact on the choice of resident set.
If the local cluster of Px;i (i.e., Hx ) holds shares (note that Thus, the local allocation within a cluster Hx is not
Hx holds either none or at least l share replicas), then it impacted by the allocation in other clusters (besides
reads l shares within Hx (the l nodes are selected such that needing to know wC and to have an intercluster level
the communication cost is minimal). The access cost in this algorithm to determine whether Hx should hold share
case is jðPx;i ; Rx ; lÞj (assume that the communication cost replicas). Now, consider the readCost. If the local cluster
between the client and Px;i is negligible). If Hx does not Hx holds the P share P replicas, then the allocation for
 r
hold share replicas, then Px;i obtains all l shares from the minimizing Hx i ðjðPx;i ; Rx ; lÞj A ðPx;i Þ within Hx can
closest cluster Hy where Hy 2 RC . The algorithm for be computed locally. Otherwise, Hx does not hold shares
transferring shares from the source cluster Hy to the and needs to read from a remote cluster Hy . No matter
requesting cluster Hx is defined as follows: 1) all the nodes how Ry is decided, jðPx;i ; Pxroot Þj þ jðHx ; RC Þj is the same;
in Hy holding the desired shares send the shares to Pyroot thus, jðPx;i ; Pxroot Þj þ jðHx ; RC Þj Ar ðHx Þ has no effect on
54 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 7, NO. 1, JANUARY-MARCH 2010

TABLE 1
Summary of the Frequently Used Notation

the decision of Ry . Now, the effect of Ar ðHx Þ on the Let UpdateCostC ðGC ; RC Þ denote the total update cost in
decision of Ry can be computed as Ar ðHx Þ jðPyroot ; Ry ; lÞj, G with the resident set RC , then
C

which can be measured locally in Hy since Pyroot has the  


knowledge of Ar ðHx Þ. Thus, Hy can make share replica UpdateCostC ðGC ; RC Þ ¼ wC  C ðRC Þ:
placement decisions locally and independently, by con-
sidering Ar ðHx Þ as requests from Pyroot . Thus, the total access cost in GC , denoted as
Based on the analysis above, we can decompose the CostðGC ; RC Þ is defined as follows:
partitioned data replica problem into two subproblems.
The first problem is to decide which cluster should keep CostC ðGC ; RC Þ ¼ UpdateCostC ðGC ; RC Þ
the share replicas at the cluster level, and we define it þ ReadCostC ðGC ; RC Þ:
as the Optimal Intercluster Resident Set Problem (OIRSP). The
second problem is to decide how many share replicas are The problem here is to decide the share replica resident set
needed for each cluster and how to allocate them within the RC in GC , such that the communication cost CostC ðGC ; RC Þ is
cluster. We define it as the Optimal Intracluster Share minimized. The optimal resident set problem in general
Allocation Problem (OISAP). Each cluster is viewed as a graph is proven to be NP-complete [34]. In Section 4, we
single node in OIRSP. In the next two sections, we specify propose a heuristic algorithm to compute a near-optimal
the two subproblems in details. resident set for a general graph. We will show that the
heuristic algorithm has an OðM 3 Þ complexity, where M is the
2.2 OIRSP Specification number of clusters in the system. We will also prove that our
We define the first problem, OIRSP, as the optimal resident heuristic algorithm is optimal in a tree network, with time
set problem in a general graph (intercluster level graph) complexity OðMÞ.
with an MSC HMSC . Our goal is to determine the optimal
2.3 OISAP Specification
RC that yields minimum access cost at the cluster level.
For a cluster Hx 2 RC with jRx j  l, all read request from When we consider allocation problem within a cluster Hx ,
Hx are served locally and the cost is 0 at the cluster level. we can isolate the cluster and consider the problem
For a cluster Hx with jRx j < l, it always transmits all read independently. As discussed earlier, all read requests from
access requests in Hx to the closest cluster Hy 2 RC to remote clusters can be viewed as read requests from the
access l distinct shares, with jRy j  l. The read cost of root node. Also, the wC updates in the entire system can be
cluster at the cluster level is Ar ðHx Þ  jðHx ; RC Þj. Let considered as updates done at the root node of the cluster.
ReadCostC ðGC ; RC Þ denote the total read cost in GC with Thus, we can simplify the notation when discussing
the resident set RC , then allocation within Hx by referring to everything in the
X cluster without the cluster subscript. For example, we use
 
ReadCostC ðGC ; RC Þ ¼ H
Ar ðHx Þ ðHx ; RC Þ: G ¼ ðP ; EÞ to represent the topology graph of Hx , where
x
P ¼ fP1 ; P2 ; . . . ; PN g. Similarly, P root represents the root
TU ET AL.: SECURE DATA OBJECTS REPLICATION IN DATA GRID 55

node of Hx , ðPi ; Pj Þ represents the shortest path between RC except that in RC0 , Hx holds l distinct shares. Thus, in
two nodes inside Hx , and R represents the resident set of RC0 , jðHx ; RC0 Þj ¼ 0. So, the read cost for read requests
Hx . Note that the simplification will be used below and in from Hx becomes zero. Also, in GC , there may be clusters
Section 5. In the situation where multiple clusters are that read from Hx . Assume that Hx is the closest cluster
considered, the original notation is used. Note that we only in RC of Hy (Hy is not in RC ). If the optimal resident set is
need to consider clusters with l or more share replicas for RC , then Hy needs to read from Hx and some other
this subproblem OISAP. clusters since Hx has less than l shares. Thus, we can
Let ReadCost (R) denote the total read cost from all the conclude
nodes in cluster Hx :
X ReadCostC ðGC ; RC Þ  ReadCostC ðGC ; RC0 Þ
ReadCostðRÞ ¼ P 2H
jðPi ; R; lÞj  Ar ðPi Þ:  Ar ðHx Þ  jðHx ; RC0 Þj and; hence;
i x

For each update in the system, the root node P root needs ReadCostC ðGC ; RC0 Þ < ReadCostC ðGC ; RC Þ:
to propagate the update to all other share holders inside Hx .
Now let us consider the update cost. Note that
Let WriteCost(R) denote the total update cost in Hx . Then
we have UpdateCostC ðGC ; RC Þ ¼ wC  jC ðRC Þj. Because
   RC0 and RC are actually composed of the same set
W riteCostðRÞ ¼ wC   P root ; R; jRj :
of clusters, so jC ðRC0 Þj ¼ jC ðRC Þj. Also, wC is
Let Cost(R) denote the total cost of all nodes in Hx , then independent of the resident set. So, we have
UpdateCostC ðGC ; RC0 Þ ¼ UpdateCostC ðGC ; RC Þ.
CostðRÞ ¼ W riteCostðRÞ þ ReadCostðRÞ: Since RC0 has the same update cost as, but lower read
Our goal is to determine an optimal resident set R to cost than RC , so CostC ðGC ; RC0 Þ < CostC ðGC ; RC Þ. RC ,
allocate the shares in Hx , such that Cost(R) is minimized. thus, cannot be an optimal residence set. It follows that
Note that m  jRj  l (we will prove this in the next 8Hx in RC , jRx j  l. u
t
section). In Section 5, we propose a heuristic algorithm with
a complexity of OðN 3 Þ to find the near-optimal solution for We also observe that the clusters in the resident set RC
this problem, where N is the number of nodes in the cluster. form a connected graph (which is a subgraph in GC ). This
property is formally proven in Theorem 3.2. From this
property, we can see that for resident set expansion
3 OIRSP SOLUTIONS (considering allocating share replicas to new clusters), only
In this section, we present a heuristic algorithm for OIRSP. neighboring clusters of the current resident set need to be
First (in Section 3.1), we discuss some properties that are considered. Thus, we can have a greedy approach to obtain
very useful for the design of the heuristic algorithm. In a solution. Note that, in Theorem 3.2, we assume that for
Section 3.2, we present the heuristic algorithm that decides each cluster Hx in GC , Ar ðHx Þ > 0.
which cluster should hold share replicas to minimize Theorem 3.2. The optimal resident set is a connected graph
access cost. within the general graph GC .
3.1 Some Useful Properties Proof. Assume that RC is an optimal resident set for GC and
We first show that if a cluster Hx is in RC (an optimal it is not connected. Since RC is not a connected graph,
resident set), then Hx should hold at least l share replicas (l there are two subgraphs RC1 and RC2 that are not
is the number of shares to be accessed by a read request). If connected. Without loss of generality, assume that cluster
Hx is in RC and Hx has less than l shares, then read accesses HMSC 2 RC1 and RC2 is the closest subgraph to RC1 in the
from Hx will anyway need to go to another cluster to get the update propagation minimal spanning tree of RC . Since
remaining shares. If Hx holds no share replicas, then read GC is connected, at least one path existed that connects
accesses from Hx may need to get the l shares from multiple RC1 and RC2 . Let ðRC1 ; RC2 Þ denote the path connecting RC1
clusters. These may result in unnecessary communication and RC2 in GC with the minimal distance (or minimum
overhead. The formal proof is given in Theorem 3.1. Based number of hops between RC1 and RC2 if distance is
on this property, the computation of the update and read measured by the number of hops) and let jðRC1 ; RC2 Þj
costs can be simplified. Essentially, for a cluster that is in denote the distance. Since RC1 and RC2 are disconnected,
RC , all read requests can be served locally. For a cluster that there exists a cluster Hx 2 ðRC1 ; RC2 Þ and Hx 62 RC .
is not in RC , all read requests can be forwarded to one Let us consider a new resident set RC0 such that RC0
single cluster in RC and all l shares can be obtained from is the same as RC , except that all clusters on path
that cluster. ðRC1 ; RC2 Þ are in RC0 . For each cluster Hx 2 ðRC1 ; RC2 Þ,
jðHx ; RC0 Þj ¼ 0. Together with Theorem 3.1, we know
Theorem 3.1. In a general graph GC , 8x, Hx 2 GC , jRx j ¼ 0 or t h a t ReadCostC ðGC ; RC Þ < ReadCostC ðGC ; RC0 Þ. F o r
jRx j  l. each update in GC , an update propagation message is
Proof. Assume that there exists one cluster Hx in RC , such propagated from RC1 to RC2 through ðRC1 ; RC2 Þ, no matter
that jRx j < l. When the resident set is RC , a read request whether RC or RC0 is the residence set, since ðRC1 ; RC2 Þ
from Hx cannot be served locally and the remaining is the shortest path between RC1 and RC2 in GC . Thus,
shares have to be obtained from at least one other cluster UpdateCostC ðGC ; RC0 Þ ¼ UpdateCostC ðGC ; RC Þ.
in GC that holds those shares. Thus, jðHx ; RC j > 0. Let Since RC0 yields a lower read cost than and has the
us construct another resident set RC0 . RC0 is the same as same update cost as RC , we can conclude that
56 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 7, NO. 1, JANUARY-MARCH 2010

Fig. 2. Sample GC and SP T ðGC ; RC Þ. (a) The original GC with RC ¼ fH1 ; H2 ; H3 g. (b) Super node S and SP T ðGC ; RC Þ constructed by Build_SPT.

CostC ðGC ; RC0 Þ < CostC ðGC ; RC Þ. Thus, RC is not a cluster in RC ). Since all read requests from Hy go through
minimal residence set. And, we can conclude that the the root, say Hz , Ar ðHy Þ is added to Ar ðHz Þ0 for later use (for
optimal resident set is a connected graph in GC . u
t new resident cluster identification). In case Hy already has a
parent, the distances to S via the original parent and via Hx
Theorem 3.2 is proven based on the assumption that for are compared. If Hx offers a shorter path to S, then
each cluster Hx , Ar ðHx Þ > 0. For the case where Ar ðHx Þ ¼ 0, Hy ’s parent is reset to Hx and the corresponding adjust-
there may be multiple optimal resident sets in GC and at ments are made. To achieve a faster convergence for new
least one of them is a connected graph. The proof is similar RC identification, Hy ’s parent is also changed to Hx if
to the proof of Theorem 3.2. For any nonconnected resident Hx ’s tree root Hz has a higher value of Ar ðHz Þ0 , when the
set, a connected resident set that incurs the same or lower distances to S via Hy ’s original parent and via Hx are equal.
communication cost can always be found. The detailed algorithm for Build_SPT is given in the
following (assume that V C ðGC ; RC Þ is already identified).
3.2 A Heuristic Algorithm for the OIRSP
In the algorithm, each node Hx has several fields. Hx :root
The goal of OIRSP is to determine the optimal resident set
and Hx :parent are the root and parent clusters of Hx ,
RC in GC . GC is a general graph. Each edge in GC is
respectively. Hx :dist is the distance from Hx to Hx ’s root (at
considered as one hop. The optimal resident set problem in a
the end of the algorithm, it is the shortest distance). We also
general graph is an instance of the problem discussed in [34].
use NextðHx Þ to denote the set of Hx ’s neighbors.
It has been shown that the problem is NP-complete. Thus,
we develop a heuristic algorithm to find a near-optimal Build_SPT ðGC ; RC Þ
solution. Our approach is to first build a minimal spanning { For all Hx , Hx 2 V C ðGC ; RC Þ
tree in GC with RC being the root and then identify the { Insert Hx into Queue; Hx :root Hx ; Hx :dist 0;
cluster to be added to RC based on the tree structure. Ar ðHx Þ0 Ar ðHx Þ; }
The clusters in GC access data hosted in RC along the While ðQueue 6¼ Þ
shortest paths, and these paths and the clusters form a set { Hx Remove a node from Queue;
of the shortest path trees. Since all the nodes in RC are For all Hy , Hy 2 NextðHx Þ ^ Hy 62 RC
connected, we view them as one virtual node S. Then, S, { If (Hy is not marked as visited) then
all clusters that are not in RC , and all the shortest access { Insert Hy into Queue; Hy :dist Hx :dist þ 1;
paths form a tree rooted at S, which is denoted as Hy :parent Hx ; Hy :root Hx :root;
SP T ðGC ; RC Þ (an example of the tree is shown in Ar ðHy :rootÞ0 Ar ðHy :rootÞ0 þ Ar ðHy Þ; Mark Hy as
Fig. 2b). We develop an efficient algorithm Build_SPT to visited; }
construct SP T ðGC ; RC Þ based on the current resident set Else
RC . To facilitate the identification of a new resident cluster, { If (Hy :dist > Hx :dist þ 1 _ ððHy :dist ¼
we also define V C ðGC ; RC Þ as the vicinity set of S, where
Hx :dist þ 1Þ ^ Ar ðHy :rootÞ0 < Ar ðHx :rootÞ0 ÞÞ then
8Hx 2 V C ðGC ; RC Þ, we have Hx 62 RC and Hx is a neigh-
{ Ar ðHy :rootÞ0 Ar ðHy :rootÞ0  Ar ðHy Þ;
boring cluster of S. Note that from Theorem 3.2, we know
Hy :dist Hx :dist þ 1; Hy :parent Hx ;
that the clusters in RC are connected. Thus, we only need
Hy :root Hx :root;
to consider clusters in V C ðGC ; RC Þ when looking for a
Ar ðHy :rootÞ0 Ar ðHy :rootÞ0 þ Ar ðHy Þ; } }
potential cluster to be added to RC .
} } }
Build SP T ðGC ; RC Þ first constructs V C ðGC ; RC Þ by visit-
ing all neighboring clusters of RC . If a cluster Hx in Actually, the check for Hy :dist > Hx :dist þ 1 in the
V C ðGC ; RC Þ has more than one neighbor in RC , then one of algorithm is not necessary since a queue is used (a node
them is chosen to be the parent cluster. Next, is always visited from a neighbor with the shortest distance
Build SP T ðGC ; RC Þ traverses GC starting from clusters in to S). A sample general graph GC with current resident set
V C ðGC ; RC Þ. From a cluster Hx , it visits all Hx ’s neighboring RC ¼ fH1 ; H2 ; H3 g is shown in Fig. 2a. The corresponding
clusters. Assume that Hy is a neighboring cluster of Hx . SP T ðGC ; RC Þ is shown in Fig. 2b, where RC is represented
When Build_SPT visits Hy from Hx , it assigns Hx as Hy ’s by the super node labeled as S. When constructing
parent if Hy does not have a parent. In this case, Hy is in the SP T ðGC ; RC Þ, S’s immediate neighbors, including H4 , H5 ,
same tree as Hx , and Hy ’s tree root is set to Hx ’s (which is a H6 , H7 , H8 , and H9 , are visited first. H4 is visited twice but
TU ET AL.: SECURE DATA OBJECTS REPLICATION IN DATA GRID 57

H1 is selected as the parent since H4 is visited from H1 first fHMSC g. It is obvious that H2 2 ðHx ; fHMSC gÞ and H2 is
and there is no need for adjustment when it is visited the the cluster on ðHx ; fHMSC gÞ right next to HMSC , and
second time. From the clusters nearest to S, the clusters that jðHx ; H2 Þj ¼ jðHx ; fHMSC gÞj  jðH2 ; fHMSC gÞj. A n y
are two hops away from S, including H10 , H11 , H12 , H13 , other path ðHx ; H2 Þ0 or ðHx ; HMSC Þ0 has a distance no
H14 , and H15 , are visited. Finally, the nodes that are further less than jðHx ; H2 Þj. With resident set fHMSC ; H2 g,
away from S are visited. ðHx ; H2 Þ will continue to be the least distance path for
We develop a heuristic algorithm to find the new cluster Hx to read from H2 in GC , and ðHx ; fHMSC ;
resident set for GC in a greedy manner. We try to find a H2 gÞ ¼ ðHx ; fHMSC gÞ  jðH2 ; fHMSC gÞj. For any cluster
new resident cluster in V C ðGC ; RC Þ and, once found, update Hx that reads fHMSC g through H2 , ðHx ; fHMSC gÞ will, at
RC accordingly. The algorithm is shown below. RC is least, not increase if H2 is added into the resident set.
initialized to fHMSC g. The algorithm first constructs Then, we can easily get ReadCostC ðGC ; fHMSC ; H2 gÞ þ
SP T ðGC ; RC Þ and identifies V C ðGC ; RC Þ. Then, a cluster Ar ðH2 Þ0  ReadCostC ðGC ; fHMSC gÞ.
Hy with the highest Ar ðHy Þ0 is selected. If Ar ðHy Þ0 > wC , According to the heuristic resident set algorithm, we
then Hy is added to RC . If Ar ðHy Þ0  wC , then the algorithm know Ar ðH2 Þ0 > wC . Thus, CostC ðGC ; fHMSC ; H2 gÞ 
terminates since no other nodes can be added to RC while CostC ðGC ; fHMSC gÞ ¼ UpdateCostC ðGC ; fHMSC ; H2 gÞ 
reducing the access communication cost. Note that, in each UpdateCostC ðGC ; fHMSC gÞ þ ReadCostC ðGC ; fHMSC ;
step, only one cluster can be added into RC because H2 gÞ  ReadCostC ðGC ; fHMSC gÞ  wC  Ar ðH2 Þ0  jðH2 ;
SP T ðGC ; RC Þ and Ar ðHx Þ0 changes when RC changes. fHMSC gÞj < 0.
Step 2. Assume that CostC ðGC ; fHMSC ; H2 ; . . . ; Hk gÞ <
RC fHMSC g; CostC ðGC ; fHMSC ; H2 ; . . . ; Hk  1 gÞ. W e s h o w t h a t
Repeat CostC ðGC ; fHMSC ; H2 ; . . . ; Hk gÞ > CostC ðGC ; fHMSC ;
{ Build SP T ðGC ; RC Þ; H2 ; . . . ; Hkþ1 gÞ, with k < n. It can be seen that the proof is
Select a cluster Hy , where Hy has the maximum the same as above and we will not show it here.
Ar ðHy Þ0 among all clusters in V C ðGC ; RC Þ; By induction, we know that CostC ðG; fHMSC ; H2 ; . . . ;
If Ar ðHy Þ0 > wC RC RC [ fHy g; } Hn gÞ < CostC ðGC ; fHMSC ; H2 ; . . . ; Hn  1 gÞ. Thus, CostC
r 0
Until ðA ðHy Þ  w Þ C ðGC ; RC Þ < CostC ðGC ; fHMSC gÞ. Also, from the induction
process, we can conclude that every time a new cluster Hi
Now, we analyze the complexity of the heuristic
joins RC , the communication cost decreases, i.e.,
resident set algorithm. Build_SPT has a time complexity
CostC ðGC ; RC ði  1ÞÞ < CostC ðGC ; RC Þ. u
t
OðP  degÞ, where P is the number of clusters in GC and
deg is the maximal degree of vertexes (clusters) in GC .
Finding a cluster Hy in V C ðGC ; RC Þ with the highest As shown in Fig. 3, the heuristic resident set algorithm is
Ar ðHy Þ0 can be done when building SP T ðGC ; RC Þ. Thus, not always optimal for a general graph GC . We expect that
the time complexity for the heuristic resident set algorithm the heuristic algorithm will obtain optimal RC in a tree
is OðjRC j  P  degÞ. Note that the final resident set RC graph. We show this in Theorem 3.4. In Lemma 3.1, we first
computed by Build_Tree is not always optimal [32]. show that when a tree network is considered, the resident
The heuristic resident set algorithm works by adding a set computed by the heuristic algorithm is a subset of the
candidate cluster Hx in GC into RC at each step, with optimal resident set.
Ar ðHx Þ0 > wC . By adding cluster Hx , the read cost is Lemma 3.1. Consider a tree network TMSC rooted at HMSC . Let
reduced by at least Ar ðHx Þ0 , and the update cost is increased RC ðTMSC Þ0 denote the optimal resident set in TMSC . Let
by wC . Thus, the total cost of GC with resident RC is less RC ðTMSC Þ be the resident set computed by the heuristic
than that of GC with initial resident set fHMSC g, if jRC j > 1. algorithm. If Hx 2 RC ðTMSC Þ, then Hx 2 RC ðTMSC Þ0 .
This will be shown in Theorem 3.3.
Proof. This is also to show that RC ðTMSC Þ  RC ðTMSC Þ0 .
Theorem 3.3. In a general graph GC , if jRC j > 1, then Assume that RC ðTMSC Þ 6 RC ðTMSC Þ0 . Let RC ðTMSC Þ \
CostC ðGC ; RC Þ < CostC ðGC ; fHMSC gÞ. Furthermore, every RC ðTMSC Þ0 ¼ S1 , and S1 6¼  because HMSC 2 S1 . Let
time a new cluster Hx (Hx satisfies the cost constraint) is added RC ðTMSC Þ  S1 ¼ S2 , then S2 6¼ . We will show that
to current resident set RC0 ðRC0  RC Þ, the communication cost there is another resident set RC ðTMSC Þ00 exists such that
decreases, i.e., CostC ðGC ; RC0 [ fHx gÞ < CostC ðGC ; RC0 Þ. CostC ðTMSC ; RC ðTMSC Þ00 Þ < CostC ðTMSC ; RC ðTMSC Þ0 Þ. Let
Proof. According to Theorem 3.1, 8x, Hx 2 RC , jRx j  l. RC ðTMSC Þ00 ¼ RC ðTMSC Þ0 [ fHy g, where Hy 2 S2 and Hy
The algorithm works by adding one cluster at a is a neighboring cluster of S1 . Let RCtemp ðTMSC Þ denote the
t i m e . L e t RC ¼ fH1 ; H2 ; . . . ; Hn g, jRC j ¼ n a n d resident set just before cluster Hy join in RC ðTMSC Þ, we
H1 ¼ HMSC . Assume that Hi is added at the ði  1Þth know that Ar ðHy Þ0 > wC . Since Hy 2 S2 , let node Hz be
step to RC . If we show that after adding each cluster, the parent cluster of Hy in TMSC , then Hz 2 RC ðTMSC Þ
the cost reduces, then we can conclude that and Hz joins the resident set RC ðTMSC Þ before Hy . As
CostC ðGC ; RC Þ < CostC ðGC ; fHMSC gÞ. We use induction we know in TMSC , ðHy ; Hz Þ is the unique path that
to prove this. clusters read RCtemp ðTMSC Þ through Hy , and RCtemp ðTMSC Þ.
Step 1. We show that CostC ðGC ; fHMSC ; H2 gÞ <
According to Theorem 3.2, if Hy 2 RC ðTMSC Þ0 , then no
CostC ðGC ; fHMSC gÞ. According to the algorithm,
cluster in the subtree rooted at Hy belongs to RC ðTMSC Þ0 .
H2 2 V C ðGC ; fHMSC gÞ, then UpdateCostC ðGC ; fHMSC ; For each access to RC ðTMSC Þ0 initiated from any cluster
H2 gÞ ¼ UpdateCostC ðGC ;fHMSC gÞ þ wC  jðH2 ;fHMSC gÞj. in the subtree, it has to be forwarded to RC ðTMSC Þ0
For each cluster Hx that reads fHMSC g through H2 , by cluster Hy . We know Ar ðHy Þ0 > wC . Thus, costC ðTMSC ;
ðHx ; fHMSC gÞ is the shortest path in GC from Hx to RC ðTMSC Þ00 Þ < costC ðTMSC ; RC ðTMSC Þ0 Þ. It follows that
58 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 7, NO. 1, JANUARY-MARCH 2010

connected and there exists a cluster Hx 2 S3 such that


Hx 2 V C ðTMSC ; S1 Þ. According to the proof given in
case 2, such Hx does not exist. Also, there exists a
cluster Hy 2 S2 such that Hy 2 V C ðTMSC ; S1 Þ. According
to the proof given in case 2, such Hy does not exist.
All together, we can conclude that such a RC ðTMSC Þ0
does not exist in TMSC .
Note that in a tree network, the Ar ðHx Þ0 of each
cluster Hx can be computed by traversing the tree, before
the heuristic algorithm starts. Also, the shortest access
paths of those nonresident clusters do not change. Thus,
the SPT only need to be built just once. Therefore, the
time complexity of the heuristic resident set algorithm
considering a tree network is OðP Þ. u
t

4 OISAP SOLUTIONS
Now, we only consider the cost inside a single cluster Hx .
As discussed in Section 2, the topology of Hx is a tree,
denoted as T . For simplicity, we define the distance of each
edge in T uniformly as one hop. In the following, we first
show two important properties of the OISAP problem with
a tree topology. Then, we give a heuristic algorithm to
decide the numbers of shares needed in Hx and where to
place them.
If the Hx ’s resident set R is not connected, then R consists
of multiple disconnected subresident set R1 ; R2 ; . . . ; Rn ,
where n > 1, and each subresident set is connected. We say
R is j þ connected in Hx , if and only if minðjRi jÞ  j, where
j > 0, Ri , for all i  n, are subgraphs in Hx , and jRi j is the
number of server nodes in Ri . We define Ri <pos Rj as
f o l l o w s : I f Ri <pos Rj , t h e n 9Py , Pz 2 Hx , w h e r e
Fig. 3. (a) The performance impact of graph size. (b) The performance Py 2 Ri ^ Pz 2 Rj , such that Pz is an ancestor of Py in T .
impact of graph degree. (c) The impact of update/read ratio. Informally, nodes in Rj are closer to the root than nodes in
Ri . Otherwise, Ri pos Rj .
Hy 2 RC ðTMSC Þ0 . Therefore, we conclude that there exists Theorem 4.1. In a tree network, the optimal resident set R in Hx
no such Hy , and it follows that RC ðTMSC Þ  RC ðTMSC Þ0 .t
u is lþ connected.
Theorem 3.4. The resident set RC ðTMSC Þ computed by our Proof. The proof is omitted due to space limitation. Please
algorithm is optimal in a tree network TMSC rooted at HMSC . refer to [32] for the full proof. u
t
The corresponding time complexity is OðP Þ.
Proof. According to the algorithm, nodes in RC ðTMSC Þ are Next, we discuss another important property that helps
connected and HMSC 2 RC ðTMSC Þ. Suppose there exists the construction of the allocation algorithm. We show that if
another resident set RC ðTMSC Þ0 which is optimal, i.e., the constraint jRj  m is removed, P root should be in R and
costC ðTMSC ; RC ðTMSC Þ0 Þ < costC ðTMSC ; RC ðTMSC ÞÞ. W e R is a connected subgraph in T . This property is shown in
prove that such a RC ðTMSC Þ0 does not exist in TMSC . Theorem 4.2. Based on this property, a share allocation
algorithm can start with R ¼ fP root g, and add one neigh-
Case 1. RC ðTMSC Þ0  RC ðTMSC Þ. According to
boring node in each step.
Theorem 3.2 and the heuristic algorithm, both
RC ðTMSC Þ and RC ðTMSC Þ0 form connected graphs. Theorem 4.2. Without the constraint jRj  m, we have P root 2
According to Lemma 3.1, we can easily get R and R is a connected subgraph.
RC ðTMSC Þ0 6 RC ðTMSC Þ. Proof. Since the updates are propagated from P root to nodes
Case 2. RC ðTMSC Þ  RC ðTMSC Þ0 . Let RC ðTMSC Þ0  in Hx that hold share replicas. According to the cost
R ðTMSC Þ ¼ S1 . Since RC ðTMSC Þ and RC ðTMSC Þ0 are both
C
models defined in Section 2.2, if a node on the path from
connected, let S2 ¼ S1 \ V C ðTMSC ; RC ðTMSC ÞÞ, then P root to the residence set hosts a share, the update cost
S2 6¼ . For each cluster Hx 2 S2 , Hx is the root of the does not increase and the read cost decreases. Thus, each
subtree that reads RC ðTMSC Þ through Hx . Since of these nodes, including P root , should be in R. We can
Ar ðHx Þ0  wC for all Hx 2 V C ðTMSC ; RC ðTMSC ÞÞ. Thus, use similar proof given in Theorem 3.2 to show that the
CostCðTMSC ; RC ðTMSC ÞÞ  CostCðTMSC ; RCðTMSC Þ[fHx gÞ. residence set R is a connected graph in T if the constraint
S o , RC ðTMSC Þ 6 RC ðTMSC Þ0 . Ot h e r w i s e : L e t S1 ¼ jRj < m is not considered. Thus, the theorem follows. t u
RC ðTMSC Þ \ RC ðTMSC Þ0 . S1 6¼  because HMSC 2 S1 . Let Now, we present the algorithm SDP-Tree, which
RC ðTMSC Þ0  S1 ¼ S2 , RC ðTMSC Þ  S1 ¼ S3 . Then, S2 6¼  determines the ðm; kÞ secrete sharing residence set in T .
and S3 6¼ , because RC ðTMSC Þ and RC ðTMSC Þ0 are Initially, the residence set R only contains node P root .
TU ET AL.: SECURE DATA OBJECTS REPLICATION IN DATA GRID 59

SDP-Tree first adds nodes into R. We call this the node Ar ðPi Þ  jl  1 þ ðPi ; RÞj. Adding Pi to R decreases
joining phase. Similar to Build SP T ðGC ; RC Þ, a node that is the read cost of each node in Ti by one and does
not in the resident set and, if allocated a share replica, can not change the read cost of any node Pj 62 Ti . Also,
maximally reduce the cost compared with all other nodes, is adding Pi to R increases the update propagation cost
selected and added to the resident set. Note that since each of the root by one. According to the cost models
cluster contains either no share or at least l shares as shown defined in Section 2.2, we have costðR [ fPi gÞ ¼
in Theorem 3.1, SDP-Tree guarantees that the result set P
costðRÞ  Pj 2Ti ReadCostðPj Þ þ wC . S i n c e Ti  Tk ,
contains at least l shares. In the second phase, SDP-Tree P P
Pj 2Ti ReadCostðPj Þ  Pj 2Tk ReadCostðPj Þ, we have
removes nodes from the residence set and it is called the
diffðPk Þ  diffðPi Þ. u
t
node removal phase. Node removal phase removes nodes
from the resident set constructed during the first phase, if it Theorem 4.3. Let RS denote the residence set computed by the
contains more than m nodes. A node, if removed from R node joining phase of SDP-Tree. If the constraint jRj < m is
will cause minimum increase in access cost, is selected and removed, then RS is the optimal resident set such that
removed from the resident set. The process continues until costðRSÞ is minimal.
only m nodes are left in R. Note that any removed nodes Proof. Assume that RS is not optimal. Suppose that there
will not cause the violation of the l þ connectivity property exists an optimal resident set RS 0 such that
and will not result in more than m=l disconnected costðRSÞ < costðRS 0 Þ. Two cases exist.
subresident sets. The OISAP is given as follows: In the Case 1. RS  RS 0 . According to Theorem 4.2 and the
algorithm, VN is the set of neighboring nodes of resident set SDP-Tree algorithm, RS and RS 0 are both connected,
R. Ar ðPi Þ0 is the number of total read requests issued from and any node in RS 0  RS must be a descendant of
nodes inPthe subtree Ti rooted at node Pi , i.e., some node in RS. Otherwise, SDP-Tree would have
Ar ðPi Þ0 ¼ Pj 2Ti Ar ðP j Þ. MaxR is a temporary variable. added the node into RS. Let Py denote a node such that
SDP-Tree{ its parent node is in RS and Py 62 RS while Py 2 RS 0 .
R fP root g; VN fchild nodes of P root g; MaxR 0; According to SDP-Tree, if Py is not added in RS, it is
while ðVN ! ¼ Þ only because that adding Py will increase access cost.
{ For 8Pi 2 VN From Lemma 4.1, we know that adding any subset of
{ If Ar ðPi Þ0 > MaxR descendants of Py together with Py would also increase
the cost. Thus, there exists no RS 0 such that RS  RS 0 .
{ MaxR Ar ðPi Þ0 ;
Case 2. RS 6 RS 0 ^ RS 6¼ RS 0 . Let Py be the first node
X Pi ;
that SDP-Tree adds, such that Py 2 RS and Py 62 RS 0 . Let
} } // find the node with highest Ar ðPi Þ0
R0 denote the residence set that SDP-Tree computed
If ðMaxR  wC Þ
before adding Py . Note that R0 6¼  (because R0 contains at
if ðjRj < lÞ { R R [ fXg; }
least P root ), and R0  RS 0 . According to Theorem 4.2, RS 0
else {X null; delete all nodes in VN ;}}
is a connected subgraph in T . Since Py 62 RS 0 , then no node
else {R R [ fXg; Y parent node of X;
in the subtree rooted at Py should be in RS 0 . According to
Ar ðYÞ0 Ar ðYÞ0  Ar ðXÞ0 ; }
the SDP-Tree algorithm, if Py 2 RS, then diffðPy Þ is
delete X from VN ;
minimal among the neighboring nodes of R0 . Two cases
if ðX 6¼ nullÞ { insert all child nodes of node X into VN ;}}
should be considered. 1) costðR0 [ fPy gÞ < costðR0 Þ. The
while ðjRj > mÞ
node Py is a neighbor of some node in R0 . This means
{ MaxR 1;
costðRS 0 [ fPy gÞ < costðRS 0 Þ, which is a contradiction to
For 8Pi 2 R
the assumption. 2) jR0 j < l, diffðPy Þ  0, and diffðPy Þ is
{if (removing Pi will retain the l þ connectivity and
minimal among the neighboring nodes of R0 . According to
does not result in more than m=l disconnected
Lemma 4.1, for any node Px such that Px 2 RS 0 and
subresident sets)
Px 62 R0 , 0  diffðPy Þ  diffðPx Þ. If for any node Px such
{ If ðAr ðPi Þ0 < MaxRÞ
that Px 2 RS 0 and Px 62 RS, diffðPy Þ ¼ diffðPx Þ, then
{ MaxR Ar ðPi Þ0 ;
there exists no Pz such that Pz 2 RS 0 and diffðPz Þ 
X Pi ; }} // find the node with lowest Ar ðXÞ0
diffðPx Þ. Otherwise, Px is added into RS before Pz . Thus,
R R  fXg; }}}
costðRSÞ  costðRS 0 Þ, which is contradictory to the
Next, we show a property of the solution obtained by the assumption. If there exists some node Px such that
SDP-Tree algorithm in Theorem 4.3. In Lemma 4.1, we first Px 2 RS 0 , Px 62 RS, and diffðPy Þ < diffðPx Þ, then let Pz
define how to compute the cost difference when a node is be a leaf node of the tree composed only by nodes in RS 0
added into the resident set. and Pz is a descendant of Px . According to Lemma 4.1,
Lemma 4.1. Let Ti denote P the subtree rooted at Pi . diffðPy Þ < diffðPx Þ  diffðPz Þ. Now, construct another
costðR [ fPi gÞ ¼ costðRÞ  Pj 2Ti ReadCostðPj Þ þ wC , resident set RS 00 such that RS 00 ¼ RS 0 [ fPy g  fPz g. We
Pi is a neighboring node of R. Also, let diffðPi Þ ¼
where P know that costðRS 00 Þ < costðRS 0 Þ, which contradicts the
wC  Pj 2Ti ReadCostðPj Þ, then diffðPk Þ  diffðPi Þ if Pk is assumption. u
t
an ancestor of Pi in T . When the number of nodes in the resident set computed
Proof. For a node Pi , Pi 2 T , ReadCostðPi Þ ¼ Ar ðPi Þ  by the node joining phase is greater than m, we need to
jðPi ; R; lÞj. According to Theorem 4.1, ReadCostðPi Þ ¼ remove some nodes in the set. The greedy removal may not
When the number of nodes in the resident set computed by the node joining phase is greater than m, we need to remove some nodes from the set. The greedy removal may not be optimal. With l < 2, however, the resident set computed by the SDP-Tree algorithm is always optimal even if the number of nodes in the resident set computed by the node joining phase is greater than m.

Now consider the time complexity of the SDP-Tree algorithm in a tree T with N nodes. The node joining phase traverses all nodes in T, and its time complexity is O(N). The node removal phase needs to remove |R| - m nodes, which takes |R| - m steps. In each step, it needs to select P_i from R, which in the worst case could take O(N). For each tested P_i, it needs to check whether the removal of P_i retains the l+ connectivity, which takes O(N) time in the worst case. So, the complexity of the node removal phase is O(N³). Thus, the overall time complexity of the SDP-Tree algorithm is O(N³).
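A sketch of the removal phase's cost structure, assuming a caller-supplied feasibility predicate that encapsulates the l+ connectivity and m/l subresident-set checks (each O(N), as discussed above); the function names are ours:

    def removal_step(R, agg, feasible):
        """Remove and return the feasible node of R with the lowest A_r'."""
        candidates = [p for p in R if feasible(R, p)]  # O(N) nodes, O(N) check each
        if not candidates:
            return None
        x = min(candidates, key=lambda p: agg[p])
        R.remove(x)
        return x

    def shrink_resident_set(R, agg, feasible, m):
        # Up to |R| - m = O(N) steps at O(N^2) work per step -> O(N^3) overall.
        while len(R) > m and removal_step(R, agg, feasible) is not None:
            pass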
5 EXPERIMENTAL STUDY

We conduct experiments to evaluate the performance of the heuristic algorithms for secret share allocation. The OIRSP heuristic algorithm is compared with the randomized K-replication, no-replication allocation, and complete replication strategies, to study its performance and effectiveness in reducing communication cost at the cluster level. In the randomized K-replication strategy, share replicas are randomly allocated among K clusters, where K is the number of clusters holding replicas as computed by the OIRSP heuristic algorithm. In the complete replication strategy, share replicas are allocated in every cluster. In the no-replication allocation strategy, there is no replication in any cluster.

The SDP-tree algorithm for the OISAP is compared with the optimal allocation algorithm and randomized M-replication, to see how well the SDP-tree algorithm performs in terms of reducing communication cost within a cluster. In the randomized M-replication strategy, M shares are randomly allocated among the nodes in a cluster, where M is computed by SDP-tree and is the number of nodes that hold shares.

The underlying network topology for the experimental studies is created by using a topology generator, Inet [11]. Inet has a lower bound on the total number of nodes in the network. We removed the bound so that graphs with different numbers of clusters (or nodes) can be created. We have also modified the generator's read/write request generation for each node. The numbers of read and write requests on the nodes in the system are generated randomly following a uniform distribution.

The metric we consider is the communication cost, which is the product of the number of messages and the number of hops along the message propagation path. To avoid biased access patterns and topology structures, we repeat the experimental steps 100 times. The final result is the average of the 100 trials.
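For concreteness, a small sketch of how this metric and the averaging might be computed, assuming each trial logs a (messages, hops) pair per request (the logging format is our assumption):

    from statistics import mean

    def trial_cost(requests):
        """Communication cost of one trial: sum of messages x hops per request."""
        return sum(msgs * hops for msgs, hops in requests)

    trials = [[(3, 2), (1, 4)], [(2, 5)]]       # two toy trials
    print(mean(trial_cost(t) for t in trials))  # reported value: mean over trials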
5.1 Performance of the OIRSP Heuristic Algorithm

In this section, we compare the performance of the OIRSP heuristic algorithm with the randomized K-replication, no-replication allocation, and complete replication strategies. We study the impacts of three factors: 1) the graph size, which is the number of clusters in the system; 2) the graph degree, which is the average number of neighbors of a cluster; and 3) the update/read ratio, which is the ratio of the total number of update requests in the entire system to the average number of read requests issued from a single cluster (these are the requests each cluster needs to process). The results are shown in Fig. 3, in which HEU, RKR, NR, and FR denote the OIRSP heuristic algorithm, the randomized K-replication, the no-replication allocation, and the complete replication algorithms, respectively.

Fig. 3a shows the impact of graph size on the performance of the four algorithms. The parameters are set as follows: cluster size = 100, which means that there are 100 nodes in each cluster, graph degree = 5, and update/read = 2. The results show that the OIRSP heuristic algorithm incurs much lower communication cost than the other replication strategies. Also, it can be seen that with a larger graph size, the OIRSP heuristic algorithm achieves better performance compared to the no-replication allocation strategy. The reason is obvious: with a larger graph size, the number of clusters that need replicas increases. Allocating share replicas to these clusters reduces communication cost, and hence, the heuristic algorithm shows better performance than the no-replication allocation strategy.

The effect of graph degree is shown in Fig. 3b. The other parameters are set as follows: cluster size = 100, graph size = 80, and update/read = 2. From the results, we can see that the performance gain of the heuristic algorithm becomes less significant with increasing graph degree. This is because the graph becomes more complete, and the distances between nodes become smaller, as the graph degree increases. When the graph degree becomes large, the communication cost for the complete replication strategy drops more significantly than that for the other algorithms. When graph degree ≥ 20, the communication cost for complete replication becomes stable and stays at twice that of the heuristic algorithm. With update/read = 2, most clusters will not be allocated share replicas; thus, the result of the heuristic algorithm is closer to (but still better than) those of the other two strategies. Compared to the complete replication strategy, the performance of our heuristic algorithm is much better when the graph degree is small. This is because more clusters get unneeded replicas in the complete replication strategy and the update cost increases significantly.

The effect of the update/read ratio is shown in Fig. 3c. The parameters are set as follows: cluster size = 100, graph size = 80, and graph degree = 5. With an increasing update/read ratio, fewer clusters should get replicas. So, the communication cost of the complete replication strategy increases rapidly and becomes far worse than that of the other two replication strategies.
5.2 The Efficiency of the OISAP SDP-Tree Algorithm

The performance of the SDP-tree algorithm is compared with the optimal allocation algorithm and the randomized M-replication algorithm. In the experiments, the trees are generated randomly by using the topology generator with changing N, D, and read/update ratio, where N is the total number of nodes in the cluster, D is the maximum node degree, and read/update is the ratio of the average number of read requests in the cluster to the total number of update requests in the system. Two configurations are considered: 1) N = 30, D = 5, read/update = 3 and 2) N = 30, D = 5, read/update = 30. We vary m and l to evaluate their impact on the performance of the algorithms. A larger m value results in higher availability if some of the share holders are not available or compromised, whereas a larger l value achieves better data confidentiality. Thus, different m and l values can be chosen based on the requirements of the data. Note that we only show the results with N = 30 because the computation cost for obtaining the optimal solutions is prohibitive. The results are shown in Fig. 4, in which HEU denotes the heuristic algorithm, OPT denotes the optimal solution, and RMR denotes the random M-replication algorithm.

Fig. 4. (a) The impact of l with read/update = 3. (b) The impact of l with read/update = 30. (c) The impact of m with read/update = 3. (d) The impact of m with read/update = 30.

From Fig. 4, we can see that the communication costs using RMR are always the highest. With RMR, the share replicas are randomly allocated, so the shares may not be close to the clients with the most frequent accesses. For all configurations, the heuristic algorithm obtains near-optimal or optimal solutions. In fact, for the worst individual case we have observed, the cost obtained by the heuristic algorithm is only about 10 percent higher than that of the optimal algorithm. In most cases (about 75 percent), the heuristic algorithm obtains the optimal solutions.

From Figs. 4a and 4b, it can be seen that with increasing l, the communication costs of all solutions increase sharply. With a higher l, a read or update request needs to access a larger number of nodes that host shares and, hence, incurs a higher access cost. Figs. 4c and 4d show the impact of the m values on the performance gains of the three algorithms. As can be seen, m generally has little impact on the access performance. Only in the extreme case, with read/update = 30 (most requests are read requests) and m changing from 3 to 30, can we see the effect of increasing m on the access performance. This is because the total read access frequency of each subtree (i.e., the total number of read requests issued by the nodes inside the subtree) is higher than the total update frequency in the entire cluster. Thus, on average, a certain number of shares hosted in the cluster is sufficient to minimize the communication cost for a specific read/update ratio. With a reasonable read/update ratio, the number of shares required to minimize access cost inside a cluster is usually small (note that a small number of shares can partition the cluster into a much larger number of subtrees such that all of them are rooted at the neighboring nodes of the resident set, i.e., the nodes that host shares). Thus, a small value of m is big enough to provide sufficient shares to minimize the communication cost inside a cluster. In other words, the value of m has little impact on the access cost inside a cluster.

6 STORAGE LIMITATIONS AND LOAD BALANCING

Replication is a natural solution for reducing the communication cost (as we have discussed) as well as for sharing the access load. In peer-to-peer data grids, replicas can be placed on widely distributed nodes to achieve better access performance and load sharing. In cluster-based data grids, caching data on widely distributed nodes is necessary (in addition to replication on cluster nodes) to achieve improved access performance and load sharing. Data partitioning can contribute to reduced storage cost: it has been shown that erasure coding-based schemes can greatly reduce the overall storage cost and effectively share the storage consumption [16], [24], [37]. However, with these schemes, it is still possible to have unbalanced access load or storage requirements due to very unbalanced access patterns. There may be too many requests inside a cluster, or a server inside a cluster may be a hot spot for many data objects. In these cases, it is necessary to adapt the algorithms proposed in this paper to bound the load and storage cost.

Let CAP_x^load and CAP_{x,i}^load denote the access load thresholds for cluster H_x and server node P_{x,i}, respectively; they can be determined by studying actual access patterns. Let CAP_x^storage and CAP_{x,i}^storage denote the storage thresholds for cluster H_x and server node P_{x,i}, respectively. Note that CAP_x^storage and CAP_x^load can be calculated based on CAP_{x,i}^storage and CAP_{x,i}^load. Also, let storage denote the storage cost of storing one data share. The algorithms can be adapted as follows. Consider the OIRSP heuristic algorithm. If Σ_{H_x ∈ RC} CAP_x^load < Σ_{H_x ∈ H_C} A_r(H_x) + W_C · |RC|, or the current storage space used in H_x is greater than CAP_x^storage - l · storage, then we choose a cluster H_y as the substitute candidate, where H_y is a neighboring cluster of H_x with A_r(H_y)' = max over H_x's neighbors H_j of A_r(H_j)', and H_y has sufficient storage space. By doing so, the increase in access communication cost in response to storage/access overload is minimized. If no such cluster exists, then another feasible cluster closest to the current resident set can be chosen. Consider the OISAP SDP-tree algorithm. If a node P_{x,i} is selected as a candidate site and its current storage usage is greater than CAP_{x,i}^storage - storage or its current load is greater than CAP_{x,i}^load - A_r(P_{x,i})', then a neighboring node of P_{x,i}, namely P_{x,j}, whose current storage usage is less than CAP_{x,j}^storage - storage and whose current load is less than CAP_{x,j}^load - A_r(P_{x,j})', is selected as the substitute candidate.
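A hedged Python sketch of the OISAP-side adaptation just described; the Node fields and helper names are our own, and only the threshold tests from the text are modeled:

    from dataclasses import dataclass, field
    from typing import List, Optional

    SHARE_SIZE = 1.0                      # "storage": cost of holding one share

    @dataclass
    class Node:
        name: str
        cap_storage: float                # CAP_{x,i}^storage
        cap_load: float                   # CAP_{x,i}^load
        used_storage: float = 0.0
        load: float = 0.0
        agg_read: float = 0.0             # A_r(P_{x,i})'
        neighbors: List["Node"] = field(default_factory=list)

    def overloaded(n: Node) -> bool:
        # Mirrors the text: usage > CAP^storage - storage, or
        # load > CAP^load - A_r(P)'.
        return (n.used_storage > n.cap_storage - SHARE_SIZE
                or n.load > n.cap_load - n.agg_read)

    def place_share(candidate: Node) -> Optional[Node]:
        """Return the candidate, else a neighbor with headroom, else None
        (the caller then falls back to a feasible node closer to the resident set)."""
        if not overloaded(candidate):
            return candidate
        for nb in candidate.neighbors:
            if not overloaded(nb):
                return nb                 # substitute candidate
        return None

The OIRSP-level check (the Σ CAP_x^load condition above) would wrap the same idea at cluster granularity.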
7 RELATED WORK

Replication techniques are frequently used to improve data availability and reduce client response time and communication cost. One major advantage of replication is performance improvement, which is achieved by moving data objects close to clients. In full replication [34], [35], all servers keep a complete set of the data objects. This is frequently not feasible for a large data set. More importantly, full replication schemes might incur unnecessary communication overhead when both read and update accesses are considered. In partial replication, each server maintains a subset of the data objects, depending on the tradeoff between read and update costs. For both full and partial replication, an important issue is where to place the replicas. Many research efforts have been devoted to the optimal placement of distributed files, data objects, and other network services. The facility location problem [12] is one major branch. It considers the optimal placement of one file or data object (set) and supports read operations only. Much research has been done along this direction, such as the p-median problem [13] and the approximate p-median problem [1]. Another branch of research on optimal data placement is the file allocation (optimal resident set) problem [12]. It considers both read and update operations. A lot of research effort has been devoted to this branch, including the optimal file allocation problem in tree networks, general graph networks, completely connected networks, and ring networks [34], the ADR algorithm [35], and the optimal placement of k replicas in a tree network [12].

All the replica placement works discussed above consider only the allocation of a single data object, which is equivalent to considering multiple independent data objects. None of them addresses the allocation of multiple dependent objects such as data that are partitioned into multiple shares. Recently, some research works have been conducted on distributed secure data systems [16], [17], [20], [37]. In both OceanStore [16] and PASIS [37], secret sharing and erasure coding schemes are used for data storage to achieve survivability and security. However, they do not focus on data placement to achieve access efficiency (especially on consumed bandwidth), although performance metrics, such as CPU time, storage cost, and latency, are indeed their major concerns. In [20], the authors attempt to model the data assurance measures of different allocations and use the metric to guide share allocation. Data assurance is defined as the probability that the data is not compromised. They consider a two-level network topology where a system is divided into clusters. It is assumed that the probability that data shares are compromised when sent across clusters is higher than when they are transmitted within a cluster or to the clients. Also, when data is secret shared but not replicated, the data assurance level is higher than when data is replicated but not secret shared. To achieve better data assurance, a distributed share allocation algorithm is presented to dynamically allocate the original shares to different subnetworks based on the client read and write patterns. The algorithm converges to an optimal allocation that yields maximal data assurance. It simply moves data shares to the clusters where there are more access demands. Performance issues, such as access communication cost and access latency, are not considered in that work. The work in [17] considers a secure data storage system that offers various levels of security guarantees, assuming data are secret shared. The entire set of shares is replicated and statically distributed to the nodes in the network. It focuses on secure access protocols rather than on replica placement.

In this paper, we consider the replica placement problem in data grids where critical data objects are partitioned to assure data confidentiality and integrity. The replicas of partitioned shares are dynamically allocated to improve access performance. Our approach minimizes the access cost of partitioned data in data grids while ensuring the required data confidentiality and integrity. Our work can be considered complementary to the work in [20]: one focuses on the performance issues and the other focuses on the security assurance issues.

8 CONCLUSION AND FUTURE RESEARCH

We have combined data partitioning schemes (secret sharing or erasure coding) with dynamic replication to achieve data survivability, security, and access performance in data grids. The replicas of the partitioned data need to be properly allocated to achieve the actual performance gains. We have developed algorithms to allocate correlated data shares in large-scale peer-to-peer data grids. To support scalability, we represent the data grid as a two-level cluster-based topology and decompose the allocation problem into two subproblems: the OIRSP and the OISAP. The OIRSP determines which clusters need to maintain share replicas, and the OISAP determines the number of share replicas needed in a cluster and their placements. Heuristic algorithms are developed for the two subproblems. Experimental studies show that the heuristic algorithms achieve good performance in reducing communication cost and are close to optimal solutions.
Several future research directions can be investigated. First, the secure storage mechanisms developed in this paper can also be used for key storage. In this alternate scheme, critical data objects are encrypted and replicated. The encryption keys are partitioned, and the key shares are replicated and distributed. To minimize the access cost, the allocation of the replicas of a data object and the replicas of its key shares should be considered together. We plan to construct the cost model for this approach and expand our algorithm to find the best placement solutions. Also, the two approaches (partitioning data or partitioning keys) have pros and cons in terms of storage and access cost and have different security and availability implications. We plan to investigate their tradeoffs; some preliminary analysis results are available in [38]. Moreover, it may be desirable to consider multiple factors for the allocation of secret shares and their replicas. Replicating data shares improves access performance but degrades security: having more share replicas may increase the chance of shares being compromised. Thus, it is desirable to determine placement solutions based on multiple objectives, including performance, availability, and security.

REFERENCES

[1] S. Arora, P. Raghavan, and S. Rao, "Approximation Schemes for Euclidean k-Medians and Related Problems," Proc. 30th ACM Symp. Theory of Computing (STOC), 1998.
[2] M. Baker, R. Buyya, and D. Laforenza, "Grids and Grid Technology for Wide-Area Distributed Computing," Software-Practice and Experience, 2002.
[3] A. Chervenak, E. Deelman, I. Foster, L. Guy, W. Hoschek, C. Kesselman, P. Kunszt, M. Ripeanu, B. Schwartzkopf, H. Stockinger, and B. Tierney, "Giggle: A Framework for Constructing Scalable Replica Location Services," Proc. ACM/IEEE Conf. Supercomputing (SC), 2002.
[4] Y. Deswarte, L. Blain, and J.C. Fabre, "Intrusion Tolerance in Distributed Computing Systems," Proc. IEEE Symp. Research in Security and Privacy, 1991.
[5] http://csepi.utdallas.edu/epc_center.htm, 2008.
[6] I. Foster and A. Lamnitche, "On Death, Taxes, and Convergence of Peer-to-Peer and Grid Computing," Proc. Second Int'l Workshop Peer-to-Peer Systems (IPTPS), 2003.
[7] http://www.ccrl-nece.de/gemss/reports.shtml, 2008.
[8] Global Information Grid, Wikipedia.
[9] www.globus.org, 2008.
[10] J. Gray, P. Helland, P. O'Neil, and D. Shasha, "The Dangers of Replication and a Solution," Proc. ACM SIGMOD, 1996.
[11] C. Jin, Q. Chen, and S. Jamin, "INET: Internet Topology Generator," Technical Report CSE-TR-433-00, EECS Dept., Univ. of Michigan, 2000.
[12] K. Kalpakis, K. Dasgupta, and O. Wolfson, "Optimal Placement of Replicas in Trees with Read, Write, and Storage Costs," IEEE Trans. Parallel and Distributed Systems, vol. 12, no. 6, 2001.
[13] O. Kariv and S.L. Hakimi, "An Algorithmic Approach to Location Problems—II: The p-medians," SIAM J. Applied Math., vol. 37, no. 3, 1979.
[14] H. Krawczyk, "Distributed Fingerprints and Secure Information Dispersal," Proc. 12th Ann. ACM Symp. Principles of Distributed Computing (PODC), 1993.
[15] H. Krawczyk, "Secret Sharing Made Short," Proc. 13th Ann. Int'l Cryptology Conf. (Crypto), 1993.
[16] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, and B. Zhao, "OceanStore: An Architecture for Global-Scale Persistent Storage," Proc. Ninth Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2000.
[17] S. Lakshmanan, M. Ahamad, and H. Venkateswaran, "Responsive Security for Stored Data," IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 9, 2003.
[18] J.H. Lala, Foundations of the Intrusion Tolerant Systems OASIS. IEEE CS, ISBN 076952057X.
[19] V. Matossian and M. Parashar, "Enabling Peer-to-Peer Interactions for Scientific Applications on the Grid," Proc. Ninth Int'l Euro-Par Conf. (Euro-Par), 2003.
[20] A. Mei, L.V. Mancini, and S. Jajodia, "Secure Dynamic Fragment and Replica Allocation in Large-Scale Distributed File Systems," IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 9, 2003.
[21] N. Nagaratnam, P. Janson, J. Dayka, A. Nadalin, F. Siebenlist, V. Welch, I. Foster, and S. Tuecke, The Security Architecture for Open Grid Services, Version 1, 2002.
[22] www.gloriad.org/gloriad/projects/project000053.html, 2008.
[23] V. Paxson, "End-to-End Routing Behavior in the Internet," IEEE/ACM Trans. Networking, vol. 5, no. 5, pp. 601-615, 1997.
[24] M. Rabin, "Efficient Dispersal of Information for Security, Load Balancing, and Fault Tolerance," J. ACM, vol. 36, no. 2, 1989.
[25] K. Ranganathan and I. Foster, "Identifying Dynamic Replication Strategies for a High Performance Data Grid," Proc. Second Int'l Workshop Grid Computing, 2001.
[26] M. Reiter and P. Rohatgi, "Homeland Security," IEEE Internet Computing, 2004.
[27] A. Samar and H. Stockinger, "Grid Data Management Pilot (GDMP): A Tool for Wide Area Replication," Proc. IASTED Int'l Conf. Applied Informatics (AI), 2001.
[28] A. Shamir, "How to Share a Secret," Comm. ACM, vol. 22, 1979.
[29] G. Singh, S. Bharathi, A. Chervenak, E. Deelman, C. Kesselman, M. Manohar, S. Patil, and L. Pearlman, "A Metadata Catalog Service for Data Intensive Applications," Proc. ACM/IEEE Conf. Supercomputing (SC), 2003.
[30] H. Stockinger, "Distributed Database Management Systems and the Data Grids," Proc. 18th IEEE Symp. Mass Storage Systems, 2001.
[31] B.M. Thuraisingham and J.A. Maurer, "Information Survivability for Evolvable and Adaptable Real-Time Command and Control Systems," IEEE Trans. Knowledge and Data Eng., vol. 11, no. 1, Jan. 1999.
[32] M. Tu, "A Data Management Framework for Secure and Dependable Data Grid," PhD dissertation, Univ. of Texas at Dallas, http://www.utdallas.edu/~tumh2000/ref/Thesis-Tu.pdf, July 2006.
[33] http://www.whitehouse.gov/reports/katrina-lessons-learned/, 2008.
[34] O. Wolfson and A. Milo, "The Multicast Policy and its Relationship to Replicated Data Placement," ACM Trans. Database Systems, vol. 16, no. 1, 1991.
[35] O. Wolfson, S. Jajodia, and Y. Huang, "An Adaptive Data Replication Algorithm," ACM Trans. Database Systems, vol. 22, no. 2, 1997.
[36] T. Wu, M. Malkin, and D. Boneh, "Building Intrusion Tolerant Applications," Proc. DARPA Information Survivability Conf. and Exposition (DISCEX), 2000.
[37] J. Wylie, M. Bakkaloglu, V. Pandurangan, M. Bigrigg, S. Oguz, K. Tew, C. Williams, G. Ganger, and P. Khosla, "Selecting the Right Data Distribution Scheme for a Survivable Storage System," Technical Report CMU-CS-01-120, Carnegie Mellon Univ., 2000.
[38] L. Xiao, I. Yen, Y. Zhang, and F. Bastani, "Evaluating Dependable Distributed Storage Systems," Proc. Int'l Conf. Parallel and Distributed Processing Techniques and Applications (PDPTA), 2007.

Manghui Tu received the PhD degree in computer science from the University of Texas at Dallas, in 2006. He is an assistant professor in the Department of Computer Science and Information Systems, Southern Utah University, Cedar City. His research interests include information security, computer forensics, distributed systems, grid computing, and ecological informatics. He is a member of the IEEE.
Peng Li received the BSc and MS degrees in computer science from the Renmin University of China and the PhD degree in computer science from the University of Texas at Dallas. He is the chief software architect of Didiom LLC. Before that, he was a visiting assistant professor in the Department of Computer Science, Western Kentucky University. His research interests include database systems, database security, transaction processing, distributed and Internet computing, and E-commerce.

I-Ling Yen received the BS degree from Tsing-Hua University, Beijing, and the MS and PhD degrees in computer science from the University of Houston. She is currently a professor of computer science in the Department of Computer Science, University of Texas at Dallas, Richardson. Her research interests include fault-tolerant computing, security systems and algorithms, distributed systems, Internet technologies, E-commerce, and self-stabilizing systems. She has published more than 100 technical papers in these research areas and received many research awards from the US National Science Foundation, Department of Defense, National Aeronautics and Space Administration, and several industry companies. She has served as a program committee member for many conferences and as the program chair/cochair for the IEEE Symposium on Application-Specific Software and System Engineering and Technology, IEEE High Assurance Systems Engineering Symposium, IEEE International Computer Software and Applications Conference, and IEEE International Symposium on Autonomous Decentralized Systems. She is a member of the IEEE.

Bhavani Thuraisingham received the degrees from the University of Bristol and the University of Wales. She joined the Department of Computer Science, University of Texas at Dallas (UTD), Richardson, in October 2004 as a professor of computer science and the director of the Cyber Security Research Center in the Erik Jonsson School of Engineering and Computer Science. Her current research interests include assured information sharing and trustworthy semantic web, secure geospatial information management, and security, surveillance, and privacy technologies. She is an elected fellow of three professional organizations: the Institute for Electrical and Electronics Engineers (IEEE), the American Association for the Advancement of Science (AAAS), and the British Computer Society (BCS) for her work in data security. She received the IEEE Computer Society's prestigious 1997 Technical Achievement Award for "outstanding and innovative contributions to secure data management." Prior to joining UTD, she worked for MITRE Corp. for 16 years, which included an Intergovernmental Personnel Act (IPA) at the National Science Foundation as the program director for data and applications security. At MITRE, she conducted research in secure data management and data mining and was also the department head in Information and Data Management as well as a consultant to the DoD, Intelligence Community, and the Treasury. Her work in information security and information management has resulted in more than 80 journal articles, more than 200 refereed conference papers, and three US patents. She is the author of eight books in data management, data mining, and data security. She teaches courses in data security and digital forensics. She is actively involved in promoting Math and Science for women and underrepresented minorities and gives talks at SWE, WITI, and Career Communications Inc. She is a fellow of the IEEE.

Latifur Khan received the BSc degree in computer science and engineering from Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, in 1993 and the MS and PhD degrees in computer science from the University of Southern California, in 1996 and 2000, respectively. He is currently an associate professor in the Department of Computer Science, University of Texas at Dallas (UTD), Richardson, where he has been teaching and conducting research since September 2000. His research work is supported by grants from NASA, the Air Force Office of Scientific Research (AFOSR), US National Science Foundation (NSF), the Nokia Research Center, Raytheon, Alcatel, and the SUN Academic Equipment Grant program. In addition, he is the director of the state-of-the-art DBL at UTD, the UTD Data Mining/Database Laboratory, which is the primary center of research related to data mining and image/video annotation at UTD. His research interests include data mining, multimedia information management, semantic web, and database systems, with the primary focus on the first three research disciplines. He has served as a committee member in numerous prestigious conferences, symposiums, and workshops, including the ACM SIGKDD Conference on Knowledge Discovery and Data Mining. He is currently on the editorial board of North Holland's Computer Standards and Interface Journal, Elsevier Publishing. He has published more than 90 papers in prestigious journals and conferences. He is a member of the IEEE.
