This action might not be possible to undo. Are you sure you want to continue?

Abhishek Parakh

*

, Subhash Kak

Computer Science Department, Oklahoma State University, Stillwater, OK 74075, USA

a r t i c l e i n f o

Article history:

Received 27 October 2008

Received in revised form 24 April 2009

Accepted 24 May 2009

Keywords:

Online data storage

Data partitioning

Implicit security architecture

a b s t r a c t

It is advantageous to use implicit security for online data storage in a cloud computing

environment. We describe the use of a data partitioning scheme for implementing such

security involving the roots of a polynomial in ﬁnite ﬁeld. The partitions are stored on ran-

domly chosen servers on the network and they need to be retrieved to recreate the original

data. Data reconstruction requires access to each server, login password and the knowledge

of the servers on which the partitions are stored. This scheme may also be used for data

security in sensor networks and internet voting protocols.

Ó 2009 Elsevier Inc. All rights reserved.

1. Introduction

Securing data stored on distributed servers is of fundamental importance in cloud computing. The traditional (explicit)

approach to securing data is to store and back it up on a single server and allow access upon the use of passwords that

are needed to be frequently changed. But there is a tendency among users to keep passwords simple and memorable

[11,21,23] leading to the possibility of brute force attacks. Furthermore since the data on the Web is archived, keys that pro-

vide adequate encryption today are likely to be broken in the future. Therefore, explicit security architecture may not be ade-

quate for many applications.

To go beyond the present approach, one may incorporate further security within the system by using puzzles [14] or

otherwise by another layer that increases the space that the intruder must search in order to break the system. Here, we

propose an implicit security architecture in which security is distributed amongst many entities. In contrast to hash-based

distributed security models for wireless sensor networks [27–29,31,12], we consider a more general method of data parti-

tioning. In this approach, stored data is partitioned into two or more pieces and stored at randomly chosen places on the

network that are known only to the owner of the data. Access to these pieces not only depends on the knowledge of a pass-

word but also on the knowledge of where the pieces are stored. The division of data is performed in such a way that the

knowledge of all the pieces is required to recreate the data and that none of the individual pieces reveals any useful infor-

mation. In scenarios where one or more pieces may be at the danger of being lost or inaccessible due to system or network

failure, one may employ schemes that can recreate the data from a subset of original pieces. Formally, our scheme consists of

two parts – the ﬁrst part is a ðk; kÞ partitioning scheme where all the k partitions are required to recreate the data and the

second part extends the ﬁrst part of the scheme to a ðk; nÞ partitioning scheme, where k 6 n and k P2, i.e. only k out of n

partitions are required to recreate the data.

A number of schemes have been proposed in the communications context for splitting and sharing of decryption keys

[25,8,22]. These schemes fall under the category of ‘‘secret sharing schemes”, where the decryption key is considered to

0020-0255/$ - see front matter Ó 2009 Elsevier Inc. All rights reserved.

doi:10.1016/j.ins.2009.05.013

* Corresponding author. Tel.: +1 405 744 5740.

E-mail addresses: parakh@cs.okstate.edu (A. Parakh), subhashk@cs.okstate.edu (S. Kak).

Information Sciences 179 (2009) 3323–3331

Contents lists available at ScienceDirect

Information Sciences

j our nal homepage: www. el sevi er . com/ l ocat e/ i ns

be a secret. Motivated by the need to have an analog of the case where several ofﬁcers must simultaneously use their keys

before a bank vault or a safe deposit box can be opened, these schemes do not consider the requirement of data protection

for a single party. Further, in any secret sharing scheme it is assumed that the encrypted data is stored in a secure place and

that none of it can be compromised without the decryption key. Some schemes that directly apply secret sharing for distrib-

uted data storage in sensor networks have been proposed [6,10,30], but have a complexity of Oðnlog

2

nÞ at best, whereas our

scheme can achieve linear complexity. Other schemes aim at sharing multiple secrets [7] but have a complexity of Oðn

2

Þ or

generate group keys for group sharing of information [17]. Certain probability models have been proposed for reconstructing

secret sharing over the Internet [18], but only focus on strategies for share distribution over the network and do not provide

methods for data partitioning.

In this paper, we therefore protect data by distributing its parts over various servers. The idea of doing these partitions is a

generalization of the use of 3 or 9 roots of a number in a cubic transformation [15]. The scheme we present is simple and

easily implementable.

We would like to stress that the presented scheme is different from Shamir’s secret sharing scheme [25] which takes the

advantage of polynomial interpolation and maps the secret on the y-axis; whereas we map the data as roots of a polynomial

on the x-axis.

A potential application of the idea of implicit security is in secure internet voting. Internet voting is an inherent part of the

online virtual worlds, such as Second Life, Active Worlds, Habbo Hotel, and others, and is only a few years away from large

scale real-world implementations [26,1–3]. These virtual worlds provide an excellent test-bed for online voting schemes,

which once successful may be used in real-world democratic process. To make use of the implicit security architecture in

online voting, the voter’s cast ballot is considered as his data and the partitions of this ballot may be stored on different serv-

ers for distributed security. Such a system will protect the intermediate election results from being revealed while distrib-

uting the security of the ballots over different servers [20].

2. Proposed data partitioning scheme

2.1. The ðk; kÞ data partitioning scheme

We consider the model of Fig. 1 to illustrate the difference between explicit and implicit security. We need to partition

data and send it to randomly chosen servers. We now describe our data partitioning scheme.

By the fundamental theorem of algebra, every equation of degree k has k roots. We use this fact to partition data into k

partitions such that each of the partition is stored on a different server. No explicit encryption of data is required to secure

each partition. The partitions in themselves do not reveal any information and hence are implicitly secure. Only when all the

partitions are brought together is the data revealed.

Consider an equation of order k

x

k

þ a

kÀ1

x

kÀ1

þ a

kÀ2

x

kÀ2

þ Á Á Á þ a

1

x þ a

0

¼ 0 ð1Þ

Eq. (1) has k roots denoted by fr

1

; r

2

; . . . ; r

k

g #fset of complex numbersg and can be rewritten as

ðx À r

1

Þðx À r

2

Þ Á Á Á ðx À r

k

Þ ¼ 0 ð2Þ

In cryptography, it is more convenient to use the ﬁnite ﬁeld Z

p

where p is a large prime. If we replace a

0

in (1) with the data

d 2 Z

p

that we wish to partition then,

x

k

þ

¸

kÀ1

i¼1

a

kÀi

x

kÀi

þ d 0 mod p ð3Þ

where 0 6 a

i

6 p À1 and 0 6 d 6 p À1. (Note that one may alternatively use Àd in (3) instead of d.) This may be rewritten as

Fig. 1. Illustration of implicit and explicit security architectures.

3324 A. Parakh, S. Kak / Information Sciences 179 (2009) 3323–3331

¸

k

i¼1

ðx À r

i

Þ 0 mod p ð4Þ

where 1 6 r

i

6 p À 1. The r

i

are the partitions (Fig. 2). It is clear that the term d in (3) is independent of variable x and

therefore

¸

k

i¼1

r

i

d mod p ð5Þ

If we allow the coefﬁcients in (3) to take values a

1

¼ a

2

¼ Á Á Á ¼ a

kÀ1

¼ 0, then (3) will have k roots only if GCDðp À1; kÞ–1 and

9b 2 Z

p

such that d is the kth power of b. One simple way to chose such a p would be to choose a prime of the form ðk Á s þ1Þ,

where s 2 N. However, such a choice would not provide good security because knowledge of the number of roots and one of

the partitions would be sufﬁcient to recreate the original data by computing the kth power of that partition. Furthermore,

not all values of d will have a kth root and hence one cannot use any arbitrary data, which would ideally be required. There-

fore, one of the restrictions on choosing the coefﬁcients is that not all of them are simultaneously zero.

For example, if the data needs to be divided into two parts then an equation of second degree is chosen and the roots

computed. If we represent this general equation by

x

2

þ a

1

x þ d 0 mod p ð6Þ

then the two roots can be calculated by solving the following equation modulo p:

x ¼

Àa

1

Æ

ﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ

a

2

1

À4d

2

ð7Þ

which has an solution in Z

p

only if the square root

ﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ

a

2

1

À4d

**exists modulo p. If the square root does not exist then a different
**

value of a

1

needs to be chosen. We present a practical way of choosing the coefﬁcients below. However, this brings out the

second restriction on the coefﬁcients, i.e. they should be so chosen such that a solution to the equation exists in Z

p

.

Theorem 1. If the coefﬁcients a

i

; 1 6 i 6 k À1 in Eq. (3) are randomly and uniformly chosen and are not all simultaneously zero,

then the knowledge of any k À1 roots of the equation, such that Eq. (4) holds, does not provide any information about the value of

d with a probability greater than that of a random guess of 1=p.

Proof. Given a speciﬁc d, the coefﬁcients in (3) can be chosen to satisfy (4) in Z

p

in the following manner. Choose at random

from the ﬁeld k À1 random roots r

1

; r

2

; . . . ; r

kÀ1

. Then kth root r

k

can be computed by solving the following equation,

r

k

¼ d Á ðr

1

Á r

2

Á Á Á Á Á r

kÀ1

Þ

À1

mod p ð8Þ

Since the roots are uniformly and randomly distributed in Z

p

, the probability of guessing r

k

without knowing the value of d is

1=p. Conversely, d cannot be estimated with a probability greater than 1=p without knowing the kth root r

k

.

It follows from Theorem 1 that data is represented as a multiple of k numbers in the ﬁnite ﬁeld. h

Fig. 2. The details of the partitioning process.

A. Parakh, S. Kak / Information Sciences 179 (2009) 3323–3331 3325

Example 1. Let data d ¼ 10, prime p ¼ 31, and let k ¼ 3. We need to partition the data into three parts for which we will

need to use a cubic equation, x

3

þa

2

x

2

þa

1

x À d 0 mod p.

We can ﬁnd the equation satisfying the required properties using Theorem 1. Assume, ðx Àr

1

Þðx Àr

2

Þðx Àr

3

Þ 0 mod 31.

We randomly choose two roots from the ﬁeld, r

1

¼ 19 and r

2

¼ 22. Therefore, r

3

d Á ðr

1

Á r

2

Þ

À1

10 Á ð19 Á 22Þ

À1

mod 31 11. The equation becomes ðx Àr

1

Þðx Àr

2

Þðx Àr

3

Þ x

3

À21x

2

þ x À10 0 mod 31, where the coefﬁcients are

a

1

¼ 1 and a

2

¼ À21 and the partitions are 11, 22 and 19.

2.2. Choosing the coefﬁcients

In the previous section, we described two conditions that must be satisﬁed by the coefﬁcients. The ﬁrst condition was that

not all the coefﬁcients are simultaneously zero and second that the choice of coefﬁcients should result in an equation with

roots in Z

p

. Since no generalized method for solving equations of degree higher than 4 exists [5], a numerical method must be

used which becomes impractical as the number of partitions grow. An easier method to compute the coefﬁcients is exem-

pliﬁed by Theorem 1 and Example 1.

One might ask why should we want to compute the coefﬁcients if we already have all the roots? We answer this question

in Section 3.

2.3. Introducing redundancy

In situations when the data pieces stored on one or more (less than a threshold number of) servers over the Internet may

not be accessible, the user should be able to recreate the data from the available pieces. The procedure outlined below extend

the k partitions to n, n Pk partitions such that only k of them need to be brought together to recreate the data. If

fr

1

; r

2

; . . . ; r

k

g is the original set of partitions then they can be mapped into a set of n partitions fp

1

; p

2

; . . . ; p

n

g by the use

of a mapping function based on linear algebra. If we construct n linearly independent equations such that

a

11

r

1

þ a

12

r

2

þ Á Á Á þ a

1k

r

k

¼ c

1

a

21

r

1

þ a

22

r

2

þ Á Á Á þ a

2k

r

k

¼ c

2

.

.

.

a

n1

r

1

þ a

n2

r

2

þ Á Á Á þ a

nk

r

k

¼ c

n

where numbers a

ij

are randomly chosen from the ﬁnite ﬁeld Z

p

, then the n new partitions are

p

i

¼ fa

i1

; a

i2

; . . . ; a

ik

; c

i

g; 1 6 i 6 n. The above linear equation can be written as matrix operation,

a

11

a

12

. . . a

1k

a

21

a

22

. . . a

2k

.

.

.

a

n1

a

n2

. . . a

nk

¸

¸

¸

¸

¸

¸

r

1

r

2

.

.

.

r

k

¸

¸

¸

¸

¸

¸

¼

c

1

c

2

.

.

.

c

n

¸

¸

¸

¸

¸

¸

To recreate r

j

; 1 6 j 6 k from the new partitions, any k of them can be brought together,

r

1

r

2

.

.

.

r

k

¸

¸

¸

¸

¸

¸

¼

a

m1

a

m2

. . . a

mk

a

n1

a

n2

. . . a

nk

.

.

.

a

i1

a

i2

. . . a

ik

¸

¸

¸

¸

¸

¸

À1

kÂk

c

1

c

2

.

.

.

c

k

¸

¸

¸

¸

¸

¸

kÂ1

A feature of the presented scheme is that new partitions may be added and deleted without affecting any of the existing

partitions.

2.4. Complexity of the proposed protocols

Following subsections discuss the complexity issues of the proposed data partitioning scheme.

2.4.1. The complexity of the ðk; kÞ data partitioning scheme

The complexity of the ðk; kÞ partitioning scheme is OðkÞ. This is illustrated by the following algorithm.

Algorithm 1a. (k,k) Data Partitioning Scheme

Input data d

1. Choose randomly and uniformly from a ﬁnite ﬁeld Z

p

; k À1 numbers r

1

; r

2

; . . . ; r

kÀ1

.

2. Compute r

k

¼ d Á ðr

1

Á r

2

Á Á Á Á Á r

kÀ1

Þ

À1

mod p.

3326 A. Parakh, S. Kak / Information Sciences 179 (2009) 3323–3331

3. Construct the kth degree polynomial: pðkÞ ¼ ðx Àr

1

Þðx À r

2

Þ Á Á Á ðx À r

k

Þ mod p ¼ x

k

þa

kÀ1

x

kÀ1

þa

kÀ2

x

kÀ2

þÁ Á Á þa

1

x þa

0

mod p.

4. The roots r

1

; r

2

; . . . ; r

k

of the polynomial pðkÞ represent the data partitions.

Output ?vector

~

R ¼ ½r

1

; r

2

; . . . ; r

k

As seen fromthe above algorithm, partition creation requires k multiplications and one inversion operation modulo prime

p. Hence the time complexity of OðkÞ.

Similarly, data reconstruction requires k multiplications which is illustrated by the following algorithm.

Algorithm 1b. (k,k) Data ReconstructionInput vector

~

R ¼ ½r

1

; r

2

; . . . ; r

k

1. Data d ¼ r

1

Á r

2

Á Á Á Á Á r

k

mod p.

Output ?d

Linear complexity of Algorithms 1a and 1b is highly desirable for resource constraint environments such as that of sensor

networks.

2.4.2. The complexity of the ðk; nÞ data partitioning scheme

The proposed ðk; nÞ data partitioning scheme uses the ðk; kÞ partitioning scheme as a sub-algorithm. That is, we ﬁrst obtain

k partitions using Algorithm 1a and then introduce redundancy to cope with possible inaccessibility of some of the parti-

tions. The ðk; nÞ partitioning scheme works as follows.

Algorithm 2a. (k,n) Data Partitioning Scheme

Input data d

1. Use Algorithm 1a to create k partitions of data d. Obtain vector

~

R ¼ ½r

1

; r

2

; . . . ; r

k

.

2. Generate n Âk matrix A such that all the rows of matrix are linearly independent.Note. A simple way to choose such a

matrix would be to choose a Vandermonde matrix of the following form

A ¼

1 x

1

x

2

1

x

3

1

. . . x

kÀ1

1

1 x

2

x

2

2

x

3

2

. . . x

kÀ1

2

.

.

.

1 x

n

x

2

n

x

3

n

. . . x

kÀ1

n

¸

¸

¸

¸

¸

¸

¸

3. Create n partitions by computing ½A

nÂk

Á ½

~

R

T

kÂ1

¼ ½C

nÂ1

, where C ¼ ½c

1

; c

2

; . . . ; c

n

T

.

4. The partitions are denoted by the pair p

i

¼ fx

i

; c

i

g, 1 6 i 6 n.

Output ! p

i

¼ fx

i

; c

i

g; 1 6 i 6 n

Algorithm 2b. (k,n) Data Reconstruction

Input any k partitions p

i

¼ fx

i

; c

i

g

1. Construct the k Âk matrix B by choosing the rows of matrix A corresponding to the given pairs p

i

and compute B

À1

.

2. Reconstruct the k shares by evaluating ½

~

R

T

kÂ1

¼ ½B

À1

kÂk

Á ½C

kÂ1

.

3. Use Algorithm 1b to reconstruct data d, where the input is the algorithm is

~

R.

Output ?d

The complexity of Algorithms 2a and 2b is Oðnlog

2

nÞ [4,16].

A more efﬁcient method, that eliminates multiplications during partition creation, would be to use a n Âk matrix with

elements from GF(2). Such matrices are used in error-correction coding [19]. This will reduce all multiplication operations

to additions operations for partition creation (assuming the cost of multiplications with 0 and 1 can be ignored). And inver-

sion of such a matrix during data reconstruction would require at most Oðn

3

Þ operations and Oðn

2

Þ multiplications. An exam-

ple of such a matrix in GF(2) is given below, where we have assumed n ¼ 4 and k ¼ 3.

A ¼

1 0 1

0 1 0

0 1 1

1 1 0

¸

¸

¸

¸

¸

A. Parakh, S. Kak / Information Sciences 179 (2009) 3323–3331 3327

Any k ¼ 3 columns the above matrix A are linearly independent. Since the above matrix only requires multiplications

with 0 and 1, the time complexity of the scheme becomes linear for partition creation. This is one of the advantages of

our scheme over traditional schemes, such as that of Shamir’s, where a matrix GF(2) cannot be used to simplify

computations.

2.4.3. A note on applications

As seen above, Algorithms 1a and 1b require OðkÞ computations while Algorithms 2a and 2b require Oðnlog

2

nÞ computa-

tions. Both these implementations are efﬁcient and suitable for resource constraint environments such as wireless sensor

networks. Further, while sensors generate crucial data and transmit it securely to the base station, the reconstruction of data

happens for most cases only at the base station. Since base stations have much higher computational capacity than the sen-

sors themselves, a matrix in GF(2) may be used to reduce the computational burden on the sensors.

3. An alternate approach to partitioning

Once the user has computed all the k roots then he may compute the equation resulting from (4) and store one or all of

the roots on different servers and the coefﬁcients on different servers. Recreation of original data can now be performed in

two ways: either using (5) or choosing one of the roots at random and retrieving the coefﬁcients and substituting the appro-

priate values in the equation (3) to compute

d

0

x

k

þ a

kÀ1

x

kÀ1

þ a

kÀ2

x

kÀ2

þ Á Á Á þ a

1

x mod p ð9Þ

Therefore, the recreated data d ðp À d

0

Þ mod p. Parenthetically, this may provide a scheme for fault tolerance. Additionally,

a user may store just one of the roots and k À1 coefﬁcients on the network. These together represent k partitions.

Note. Distinct sets of coefﬁcients (for a given constant termin the equation) result in distinct sets roots and vice versa. This

is because two distinct sets of coefﬁcients represent distinct polynomials because two polynomials are said to be equal if and

only if they have the same coefﬁcients. By the fundamental theorem of algebra, every polynomial has unique set of roots.

Two sets of roots R

1

and R

2

are distinct if and only if 9r

i

8r

j

ðr

i

–r

j

Þ, where r

i

2 R

1

and r

j

2 R

2

. To compute the correspond-

ing polynomials and the two sets coefﬁcients C

1

and C

2

, we perform

¸

k

i¼1;r

i

2R

1

ðx Àr

i

Þ 0 mod p and

¸

k

j¼1;r

j

2R

2

ðx Àr

j

Þ 0 mod p and read the coefﬁcients of from the resulting polynomials, respectively. It is clear the at least

one of the factors of the two polynomials is distinct because at least one of the roots is distinct; hence the resulting polyno-

mials for a distinct set of roots are distinct.

Theorem 2. Determining the coefﬁcients of a polynomial of degree k P2 in a ﬁnite ﬁeld Z

p

, where p is prime, by brute force,

requires Xðd

p

kÀ1

ðkÀ1Þ!

eÞ computations.

Proof. If A represents the set of coefﬁcients A ¼ fa

1

; a

2

; . . . ; a

kÀ1

g, where 0 6 a

i

6 p À1, then by the above note each distinct

instance of set A gives rise to a distinct set of roots R ¼ fr

1

; r

2

; . . . ; r

k

g, and conversely every distinct instance of set R gives rise

to a distinct set of coefﬁcients A. Therefore, if we ﬁx d to a constant then (5) can be used to compute a set of roots. Every

distinct set of roots is therefore a k À1 combination of a multiset, where each element has inﬁnite multiplicity, and equiv-

alently a k À1 combination of set S ¼ f0; 1; 2; . . . ; p À 1g with repetition allowed. Thus, the number of possibilities for the

choices of coefﬁcients is given by the following expression:

p

k À1

¼

p À1 þ ðk À1Þ

k À1

¼

ðp þ k À2Þ!

ðp À1Þ!ðk À1Þ!

¼

ðp þ k À2Þðp þ k À3Þ Á Á Á ðp þ k À kÞðp À1Þ!

ðp À1Þ!ðk À1Þ!

ð10Þ

¼

ðp þ k À2Þðp þ k À3Þ Á Á Á p

ðk À1Þ!

Pd

p

kÀ1

ðk À1Þ!

e

Here we have used the fact that in practice p ) k P2, hence the result. We have ignored the one prohibited case of all coef-

ﬁcients being zero, which has no effect on our result.

4. A stronger variation to the protocol modulo a composite number

An additional layer of security may be added to the implementation by performing computations modulo a composite

number n ¼ p Á q; p and q are primes, and using an encryption exponent to encrypt the data before computing the roots of

the equation. In such a variation, knowledge of all the roots and coefﬁcients of the equation will not reveal any information

about the data and the adversary will require to know the secret factors of n. For this we can use an equation such as the

follows:

3328 A. Parakh, S. Kak / Information Sciences 179 (2009) 3323–3331

x

k

þ

¸

kÀ1

i¼1

a

kÀi

x

kÀi

þ d

y

0 mod n ð11Þ

where y is a secretly chosen exponent and GCDðy; uðnÞÞ ¼ 1. If the coefﬁcients are chosen such that (11) has k roots then

¸

k

i¼1

r

i

d

y

mod n ð12Þ

Appropriate coefﬁcients may be chosen in a manner similar to that described in previous sections. It is clear that compromise

of all the roots and coefﬁcients will at the most reveal c ¼ d

y

mod n. In order to compute the original value of d, the adver-

sary will require the factors of n which are held secret by the user.

5. Addressing the data partitions

The previous sections consider the security of the proposed scheme when prime p and composite n are public knowledge.

However, there is nothing that compels the user to disclose the values of p and n. If we assume that p and n are secret values

then the partitions may be stored in the form of an ‘‘encrypted link list”. We deﬁne an encrypted link list as a link list in

which every pointer is in encrypted form and in order to ﬁnd out which node the present node points to, the user needs

to decrypt the pointer which can be done only if certain secret information is known.

If we assume that p and n are public values then the pointer can be so encrypted that each decryption either leads to

multiple addresses or depends on the knowledge of the factors or both. Only the authentic user will know which of the mul-

tiple addresses is to be picked.

Alternatively, one may use a random number generator and generate a random sequence of server IDs using a secret seed.

One way to generate this seed may be to ﬁnd the hash of the original data and use it to seed the generated sequence and keep

the hash secret.

6. Security by key distribution

In some cases, when there is a large amount of data, a user may ﬁnd it inefﬁcient to create data partitions and distribute

them over the network but may wish to encrypt the data and store it on a single server which he trusts and keep encryption

key secret. Almost always, encryption keys are very large random numbers and cannot be memorized. Therefore, the user

may create partitions of the key and spread them over the network. This approach may be more efﬁcient, if not more secure,

than creating data partitions of enormous amounts of data. Furthermore, the access to the servers on which the key parti-

tions are stored may be password restricted.

We stress that this scenario is different from the secret sharing scenario where multiple parties are involved since our

case involves data security for a single party.

Keys may be partitioned using the scheme presented in the previous sections where the data is replaced by the key. Alter-

natively one may use the key partitioning scheme presented in the following sub-section that uses polynomials in Galois

Field GF(2

m

), for which, given the distributed nature of the scheme, the security will be better than polynomial generaliza-

tion of power encryption transformations [13,9].

6.1. Key partitioning in galois ﬁeld 2

m

A Galois Field (2

m

) consists of polynomials of degree mÀ 1 or less, such that their coefﬁcients lie in GF(2). For example,

a

mÀ1

x

mÀ1

þa

mÀ2

x

mÀ2

þÁ Á Á þa

1

x þa

0

, where a

i

¼ 1 or 0 represents a polynomial belonging to GFð2

m

Þ and it may be written in

Fig. 3. Illustration of key partitioning.

A. Parakh, S. Kak / Information Sciences 179 (2009) 3323–3331 3329

binary representation as (a

mÀ1

a

mÀ2

. . . a

1

a

0

). Multiplication in GFð2

m

Þ is performed modulo an irreducible polynomial

gðxÞ ¼ x

m

þb

mÀ1

x

mÀ1

þÁ Á Á þb

1

x þb

0

, where b

i

2 GFð2Þ. A polynomial gðxÞ is said to be irreducible if it cannot be factored into

two or more polynomials each with coefﬁcients in GF(2) and each of degree less than m.

Therefore, a user can generate a random key and partition it into k partitions using the following procedure. He generates

k random polynomials of any random degree in GF(2

m

) and computes their product modulo the irreducible polynomial gðxÞ.

The resultant polynomial of degree m À1 with binary coefﬁcients is taken as the required key in its binary representation

and the randomly generated polynomials are the partitions (Fig. 3).

Example 2. In order to generate a random 8 bit key and create ﬁve partitions of it, the user may proceed as follows. Let

gðxÞ ¼ x

8

þ x

2

þ1 be an irreducible polynomial in GF(2

m

). He chooses ﬁve polynomials of degree m À1 or less (at random),

such as p

1

ðxÞ ¼ x

7

þ x

6

þx, p

2

ðxÞ ¼ x

4

þx

2

þx þ1; p

3

ðxÞ ¼ x

5

þx

3

; p

4

ðxÞ ¼ x

7

þx

6

þx

5

þ x and p

5

ðxÞ ¼ x

6

þx

2

þ x þ1 and

computes there product, p

1

ðxÞp

2

ðxÞp

3

ðxÞp

4

ðxÞp

5

ðxÞ kðxÞ mod gðxÞ, where kðxÞ is the ‘‘key polynomial” and the coefﬁcients

are the binary representation of the key.

kðxÞ p

1

ðxÞp

2

ðxÞp

3

ðxÞp

4

ðxÞp

5

ðxÞ x

6

þ x

4

þ x

3

þ x þ1 mod gðxÞ

Therefore, the random key generated is k ¼ 01011011 and the partitions are, p

1

¼ 11000010; p

2

¼ 00010111;

p

3

¼ 00101000; p

4

¼ 11100010; p

5

¼ 01000111.

Note that the knowledge of k À1 partitions, in the scheme presented above, leaves the kth partition completely undeter-

mined and the adversary still requires Hð2

mÀ1

Þ trials to determine the encryption key. This is because all the partitions

2 GFð2

m

) and that each one of them is independently and randomly chosen from the ﬁeld. Hence, knowledge of k À1 par-

titions does not reveal any information about the choice made for the kth partition. Further, since the kth partition can be any

random polynomial from the ﬁeld, in order to compute this polynomial by brute force would require on an average half the

number of trial of 2

m

, i.e. Hð2

mÀ1

Þ. Since the right hand side of the equation (the key) is completely dependent on the choice

of left hand side of the equation (the factors of the key), one would require Hð2

mÀ1

Þ trial to compute the key with the knowl-

edge of k À1 partitions [24].

In practice, keys are of 128–512 bits in AES, i.e. Hð2

127

Þ to Hð2

511

Þ possible trials on an average.

The above scheme can be extended to a ðk; nÞ partitioning scheme by converting the existing partitions into decimal inte-

gers and using the extension presented in Section 2.3.

7. Partition redundancy v/s no redundancy

In cases where the accessibility to the partitions is guaranteed and the data disks are sufﬁciently backed-up, partition

redundancy may lead to un-necessary wastage of storage space. This is certainly true for in-system storage where data par-

titions may be distributed over numerous sectors randomly and not contiguously stored. Also in case of network storage lar-

ger number of partitions would require larger upload time which may be avoidable.

8. Application in Internet voting protocols

Internet voting is a challenge for cryptography because of its opposite requirements of conﬁdentiality and veriﬁabil-

ity. There is the further restriction of ‘‘fairness” that the intermediate election results must be kept secret. One of the

ways to solve this problem is to use multiple layers of encryption such that the decryption key for each layer is avail-

able with a different authority. This obviously leaves open the question as to who is to be entrusted the encrypted

votes.

A more effective way to implement fairness would be to avoid encryption keys altogether and divide each cast ballot into

k or more pieces such that each authority is given one of the pieces [20]. This solves the problem of entrusting any one

authority with all the votes and if any of the authorities (less than the threshold) try to cheat by deleting some of the cast

ballots, then the votes may be recreated using the remaining partitions. Such a system implicitly provides a back-up for the

votes.

9. Conclusions and future work

We have described an implicit security architecture suited for the application of online storage. In this scheme data is

partitioned in such a way that each partition is implicitly secure and does not need to be encrypted. These partitions are

stored on different servers on the network which are known only to the user. Reconstruction of the data requires access

to each server and the knowledge as to which servers the data partitions are stored. Several variations of this scheme are

described, which include the implicit storage of encryption keys rather than the data, and where a subset of the partitions

may be brought together to recreate the data.

One can propose variations to the scheme presented in this paper where the data partitions need to be brought together

in a deﬁnite sequence. One way to accomplish it is by representing partition in the following manner:

3330 A. Parakh, S. Kak / Information Sciences 179 (2009) 3323–3331

p

1

; p

1

ðp

2

Þ; p

2

ðp

3

Þ; . . ., where p

i

ðp

j

Þ represents the encryption of p

j

by means of p

i

. Such a scheme will increase the complex-

ity of the brute force decryption task for an adversary.

The proposed scheme is also of potential use in sensor networks. To protect sensitive information on sensors that are

physically compromised, one may partition the data and store these parts on other sensors in the network. The key that

is used to ﬁnd the location of these parts is encrypted and this encryption is changed from time to time. This provides solu-

tions that are more general than other distributed security models used in sensor networks [27]. Our approach leads to inter-

esting research issues such as the optimal way to distribute the parts in a network of n sensors, so that the network can work

even if some of the sensors happen to malfunction by themselves or as a consequence of intervention by an adversary. It also

leads to the question of reliability and tolerance to faults in such a system.

Acknowledgement

This work was partly funded by Central Rural Electric Corporation, Stillwater, Oklahoma.

References

[1] Reuters – second life. <http://secondlife.reuters.com/stories/2008/05/07/eve-online-experiments-with-virtual-democracy/>.

[2] Second life herald. <http://www.secondlifeherald.com>.

[3] Expanding the Use of Electronic Voting Technology for UOCAVA Citizens, Department of Defence, May 2007.

[4] A. Aho, J. Hopcroft, J. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, 1974.

[5] A. Bharucha-Reid, M. Sambandham, Random Polynomials, Academic Press, New York, 1986.

[6] T. Claveirole, M. de Amorim, M. Abdalla, Y. Viniotis, Securing wireless sensor networks against aggregator compromises, Communications Magazine

IEEE 46 (4) (2008) 134–141.

[7] M.H. Dehkordi, S. Mashhadi, New efﬁcient and practical veriﬁable multi-secret sharing schemes, Information Sciences 178 (9) (2008) 2262–2274.

[8] D. Denning, Cryptography and Data Security, Addison-Wesley Publishing Company, 1982.

[9] L. Dickson, Linear Groups with an Exposition of the Galois Field Theory, Dover Publications, 1958.

[10] A. Eldin, A. Ghalwash, H. ElShandidy, Enhancing packet forwarding in mobile ad hoc networks by exploiting the information dispersal algorithm,

Communications, Computers and Applications, 2008. MIC-CCA 2008. Mosharaka International Conference, August 2008, pp. 20–25.

[11] E. Gheringer, Choosing passwords: security and human factors, Proceedings of IEEE 90 (2002) 369–373.

[12] B. Greenstein, D. Estrin, R. Govindan, S. Ratnasamy, S. Shenker, Difs: a distributed index for features in sensor networks, Ad Hoc Networks 1 (2003)

333–349.

[13] S. Kak, Exponentiation modulo a polynomial for data security, International Journal of Computer and Information Science 13 (1983) 337–346.

[14] S. Kak, On the method of puzzles for key distribution, International Journal of Computer and Information Science 14 (1984) 103–109.

[15] S. Kak, A cubic public-key transformation, Circuits, Systems and Signal Processing 26 (2007) 353–359.

[16] D. Knuth, The Art of Computer Programming, vol. 2, Addison-Wesley, 1969.

[17] C. Kuang-Hui, C. Wen-Tsuen, A new group key generating model for group sharing, Information Sciences 64 (1-2) (1992) 83–94.

[18] C.-Y. Lee, Y.-S. Yeh, D.-J. Chen, K.-L. Ku, A probability model for reconstructing secret sharing under the internet environment, Information Sciences 116

(2-4) (1999) 109–127.

[19] T. Moon, Error Correction Coding: Mathematical Methods and Algorithms, Wiley-Interscience, 2005.

[20] A. Parakh, S. Kak, Internet voting protocol based on implicit data security, in: Proceedings of 17th International Conference on Computer

Communications and Networks 2008, ICCN 08, pp. 1-4.

[21] R. Proctor, M. Lien, K. Vu, G. Salvendy, Improving computer security for authentication of users: inﬂuence of proactive password restrictions, Behavior

Research Methods, Instruments and Computers 34 (2002) 163–169.

[22] A. Renvall, C. Ding, A nonlinear secret sharing scheme, Information Security and Privacy, LNCS 1172 (1996) 56–66.

[23] S. Riley, Password security: what users know and what they actually do, Usability News 8.1 (2006).

[24] K. Rosen, Discrete Mathematics and its Applications, McGraw-Hill, 2007.

[25] A. Shamir, How to share a secret, Communication of ACM 22 (11) (1979) 612–613.

[26] R. Sinnott, First report-december 2004 (irish commission on electronic voting) appendix 2c, 2004, pp. 153–191.

[27] N. Subramanian, C. Yang, W. Zhang, Securing distributed data storage and retrieval in sensor networks, Pervasive and Mobile Computing 3 (6) (2007)

659–676. December.

[28] G. Wang, W. Zhang, G. Cao, T.L. Porta, On supporting distributed collaboration in sensor networks, in: IEEE Military Communications Conference,

October 2003.

[29] F. Ye, H. Luo, J. Cheng, S. Lu, L. Zhang, A two-tier data dissemination model for large-scale wireless sensor networks, in: ACM International Conference

on Mobile Computing and Networking, 2002, pp. 148–159.

[30] T. Yuan, S. Zhang, Secure fault tolerance in wireless sensor networks, in: CITWORKSHOPS’08: Proceedings of the 2008 IEEE Eighth International

Conference on Computer and Information Technology Workshops, 2008, pp. 477–482.

[31] W. Zhang, G. Cao, T.L. Porta, Data dissemination with ring-based index for sensor networks, in: IEEE International Conference on Network Protocol,

November 2003.

A. Parakh, S. Kak / Information Sciences 179 (2009) 3323–3331 3331

10. .3324 A. 1 to illustrate the difference between explicit and implicit security. To make use of the implicit security architecture in online voting. (Note that one may alternatively use Àd in (3) instead of d. These virtual worlds provide an excellent test-bed for online voting schemes. The ðk. Some schemes that directly apply secret sharing for distrib2 uted data storage in sensor networks have been proposed [6. we therefore protect data by distributing its parts over various servers. The idea of doing these partitions is a generalization of the use of 3 or 9 roots of a number in a cubic transformation [15]. 1. . every equation of degree k has k roots. these schemes do not consider the requirement of data protection for a single party.30]. Illustration of implicit and explicit security architectures. The scheme we present is simple and easily implementable. Motivated by the need to have an analog of the case where several ofﬁcers must simultaneously use their keys before a bank vault or a safe deposit box can be opened. . We would like to stress that the presented scheme is different from Shamir’s secret sharing scheme [25] which takes the advantage of polynomial interpolation and maps the secret on the y-axis. Consider an equation of order k xk þ akÀ1 xkÀ1 þ akÀ2 xkÀ2 þ Á Á Á þ a1 x þ a0 ¼ 0 Eq. Parakh. Proposed data partitioning scheme 2. whereas our scheme can achieve linear complexity. in any secret sharing scheme it is assumed that the encrypted data is stored in a secure place and that none of it can be compromised without the decryption key.) This may be rewritten as Fig. which once successful may be used in real-world democratic process. but only focus on strategies for share distribution over the network and do not provide methods for data partitioning. Certain probability models have been proposed for reconstructing secret sharing over the Internet [18]. Habbo Hotel. whereas we map the data as roots of a polynomial on the x-axis. Further. We now describe our data partitioning scheme. it is more convenient to use the ﬁnite ﬁeld Zp where p is a large prime. In this paper.1–3]. xk þ kÀ1 X i¼1 akÀi xkÀi þ d 0 mod p ð3Þ where 0 6 ai 6 p À 1 and 0 6 d 6 p À 1. and others. S. but have a complexity of Oðnlog nÞ at best. (1) has k roots denoted by fr1 . the voter’s cast ballot is considered as his data and the partitions of this ballot may be stored on different servers for distributed security. Other schemes aim at sharing multiple secrets [7] but have a complexity of Oðn2 Þ or generate group keys for group sharing of information [17]. 2. Active Worlds. . By the fundamental theorem of algebra. Internet voting is an inherent part of the online virtual worlds. rk g # fset of complex numbersg and can be rewritten as ð1Þ ðx À r 1 Þðx À r 2 Þ Á Á Á ðx À r k Þ ¼ 0 ð2Þ In cryptography. We use this fact to partition data into k partitions such that each of the partition is stored on a different server. Such a system will protect the intermediate election results from being revealed while distributing the security of the ballots over different servers [20]. kÞ data partitioning scheme We consider the model of Fig. No explicit encryption of data is required to secure each partition. The partitions in themselves do not reveal any information and hence are implicitly secure. If we replace a0 in (1) with the data d 2 Zp that we wish to partition then.1. . Only when all the partitions are brought together is the data revealed. such as Second Life. We need to partition data and send it to randomly chosen servers. Kak / Information Sciences 179 (2009) 3323–3331 be a secret. A potential application of the idea of implicit security is in secure internet voting. r2 . and is only a few years away from large scale real-world implementations [26.

the probability of guessing r k without knowing the value of d is 1=p. Furthermore. 2).A. We present a practical way of choosing the coefﬁcients below. this brings out the second restriction on the coefﬁcients. If the coefﬁcients ai . For example. One simple way to chose such a p would be to choose a prime of the form ðk Á s þ 1Þ. not all values of d will have a kth root and hence one cannot use any arbitrary data. i. if the data needs to be divided into two parts then an equation of second degree is chosen and the roots computed. Given a speciﬁc d. Therefore. The details of the partitioning process. Parakh. (4) holds. If the square root does not exist then a different 1 value of a1 needs to be chosen.e. d cannot be estimated with a probability greater than 1=p without knowing the kth root rk . rk ¼ d Á ðr 1 Á r2 Á Á Á Á Á r kÀ1 ÞÀ1 mod p ð8Þ Since the roots are uniformly and randomly distributed in Zp . Then kth root r k can be computed by solving the following equation. such that Eq. r2 . . where s 2 N. . such a choice would not provide good security because knowledge of the number of roots and one of the partitions would be sufﬁcient to recreate the original data by computing the kth power of that partition. . The r i are the partitions (Fig. S. one of the restrictions on choosing the coefﬁcients is that not all of them are simultaneously zero. Kak / Information Sciences 179 (2009) 3323–3331 k Y ðx À ri Þ 0 mod p i¼1 3325 ð4Þ where 1 6 ri 6 p À 1. It follows from Theorem 1 that data is represented as a multiple of k numbers in the ﬁnite ﬁeld. (3) are randomly and uniformly chosen and are not all simultaneously zero. 2. However. Choose at random from the ﬁeld k À 1 random roots r1 . Conversely. rkÀ1 . Theorem 1. then (3) will have k roots only if GCDðp À 1. which would ideally be required. However. Proof. 1 6 i 6 k À 1 in Eq. does not provide any information about the value of d with a probability greater than that of a random guess of 1=p. . It is clear that the term d in (3) is independent of variable x and therefore k Y i¼1 r i d mod p ð5Þ If we allow the coefﬁcients in (3) to take values a1 ¼ a2 ¼ Á Á Á ¼ akÀ1 ¼ 0. If we represent this general equation by x2 þ a1 x þ d 0 mod p then the two roots can be calculated by solving the following equation modulo p: ð6Þ x¼ Àa1 Æ qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ a2 À 4d 1 2 ð7Þ qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ which has an solution in Zp only if the square root a2 À 4d exists modulo p. they should be so chosen such that a solution to the equation exists in Zp . the coefﬁcients in (3) can be chosen to satisfy (4) in Zp in the following manner. kÞ–1 and 9b 2 Zp such that d is the kth power of b. h Fig. then the knowledge of any k À 1 roots of the equation. .

A feature of the presented scheme is that new partitions may be added and deleted without affecting any of the existing partitions. 7¼6 . 7 6 . any k of them can be brought together. . p2 . The equation becomes ðx À r 1 Þðx À r2 Þðx À r 3 Þ x3 À 21x2 þ x À 10 0 mod 31. . . . . r1 ¼ 19 and r 2 ¼ 22.k) Data Partitioning Scheme Input data d 1. (k. . Therefore. . 7 54 .3326 A. r k g is the original set of partitions then they can be mapped into a set of n partitions fp1 . Choose randomly and uniformly from a ﬁnite ﬁeld Zp . S. then the n pi ¼ fai1 . . kÞ data partitioning scheme The complexity of the ðk. kÞ partitioning scheme is OðkÞ. 6 . we described two conditions that must be satisﬁed by the coefﬁcients.. k À 1 numbers r1 . . r 3 d Á ðr1 Á r2 ÞÀ1 10 Á ð19 Á 22ÞÀ1 mod 31 11. Parakh. Introducing redundancy In situations when the data pieces stored on one or more (less than a threshold number of) servers over the Internet may not be accessible. ðx À r1 Þðx À r 2 Þðx À r 3 Þ 0 mod 31. . 2. 4 . One might ask why should we want to compute the coefﬁcients if we already have all the roots? We answer this question in Section 3. . 7 76 . an1 3 2 3 r1 c1 . Since no generalized method for solving equations of degree higher than 4 exists [5]. 2. pn g by the use of a mapping function based on linear algebra. the user should be able to recreate the data from the available pieces. 5 ck kÂ1 . . . ank rk cn 32 To recreate rj . where the coefﬁcients are a1 ¼ 1 and a2 ¼ À21 and the partitions are 11. 4 . 1 6 i 6 n.. an1 r 1 þ an2 r 2 þ Á Á Á þ ank rk ¼ cn where numbers aij are randomly chosen from the ﬁnite ﬁeld Zp . and let k ¼ 3. . This is illustrated by the following algorithm. Compute rk ¼ d Á ðr 1 Á r2 Á Á Á Á Á rkÀ1 ÞÀ1 mod p. 5 . .1. 7 6 . Algorithm 1a. 1 6 j 6 k from the new partitions. amk . 3À1 2 c1 3 6r 7 6 a 6 2 7 6 n1 6 . The above linear equation can be written as matrix operation. 7 6 . new partitions are 2 a11 a12 a22 an2 6a 6 21 6 . 7 4 . ci g. Choosing the coefﬁcients In the previous section.2. The complexity of the ðk. Assume. .4. . 2. aik . . . The ﬁrst condition was that not all the coefﬁcients are simultaneously zero and second that the choice of coefﬁcients should result in an equation with roots in Zp . . . a numerical method must be used which becomes impractical as the number of partitions grow. x3 þ a2 x2 þ a1 x À d 0 mod p. 5 4 . a1k . r kÀ1 . .4. a2k 76 r 2 7 6 c2 7 76 7 6 7 76 . If we construct n linearly independent equations such that a11 r 1 þ a12 r 2 þ Á Á Á þ a1k rk ¼ c1 a21 r 1 þ a22 r 2 þ Á Á Á þ a2k rk ¼ c2 . 2. Complexity of the proposed protocols Following subsections discuss the complexity issues of the proposed data partitioning scheme. 2 r1 3 2 am1 am2 an2 ai2 . We randomly choose two roots from the ﬁeld. If fr1 . An easier method to compute the coefﬁcients is exempliﬁed by Theorem 1 and Example 1. We can ﬁnd the equation satisfying the required properties using Theorem 1. . 7 ¼ 6 . . Kak / Information Sciences 179 (2009) 3323–3331 Example 1.. 6 . 2. rk ai1 ank 7 7 7 7 5 aik kÂk 6c 7 6 27 6 . prime p ¼ 31. . . r2 . r2 . The procedure outlined below extend the k partitions to n. 22 and 19. Let data d ¼ 10. n P k partitions such that only k of them need to be brought together to recreate the data.3. We need to partition the data into three parts for which we will need to use a cubic equation. 5 4 . ai2 .. .

Algorithm 1b. . x2 7 7 7 7 5 kÀ1 . . . rk R 2 6 6 1 x2 6 A¼6. Construct the k Â k matrix B by choosing the rows of matrix A corresponding to the given pairs pi and compute BÀ1 . Data d ¼ r 1 Á r 2 Á Á Á Á Á r k mod p. Parakh. Obtain vector ~ ¼ ½r 1 . Construct the kth degree polynomial: pðkÞ ¼ ðx À r1 Þðx À r2 Þ Á Á Á ðx À rk Þ mod p ¼ xk þ akÀ1 xkÀ1 þ akÀ2 xkÀ2 þ Á Á Á þ a1 x þ a0 mod p. that eliminates multiplications during partition creation. . 1 6 i 6 n.k) Data ReconstructionInput 1. . . . Output ? d Linear complexity of Algorithms 1a and 1b is highly desirable for resource constraint environments such as that of sensor networks. cn T . c2 . data reconstruction requires k multiplications which is illustrated by the following algorithm. .16]. ci g. (k. 6. Output ? d The complexity of Algorithms 2a and 2b is Oðnlog nÞ [4.n) Data Partitioning Scheme Input data d R 1. Output ? vector ~ ¼ ½r 1 . xn 3. This will reduce all multiplication operations to additions operations for partition creation (assuming the cost of multiplications with 0 and 1 can be ignored). Similarly. would be to use a n Â k matrix with elements from GF(2). . 1 6 i 6 n Algorithm 2b. S. A simple way to choose such a matrix would be to choose a Vandermonde matrix of the following form vector ~ ¼ ½r1 . . nÞ data partitioning scheme The proposed ðk. . Output ! pi ¼ fxi . r 2 .2. where we have assumed n ¼ 4 and k ¼ 3. ci g. Use Algorithm 1a to create k partitions of data d. . Reconstruct the k shares by evaluating ½~T kÂ1 ¼ ½BÀ1 kÂk Á ½CkÂ1 . rk R As seen from the above algorithm. 2. R 4. Kak / Information Sciences 179 (2009) 3323–3331 3327 3. . Use Algorithm 1b to reconstruct data d. 2. 2. (k.Note. . ci g 1. where C ¼ ½c1 . r2 . 2 3 1 0 1 60 1 07 6 7 A¼6 7 40 1 15 1 1 0 2 . The roots r1 . (k. . rk . . . 4. r 2 . . . Hence the time complexity of OðkÞ. Generate n Â k matrix A such that all the rows of matrix are linearly independent. . Create n partitions by computing ½AnÂk Á ½~T kÂ1 ¼ ½CnÂ1 .4. Algorithm 2a. . The ðk.n) Data Reconstruction Input any k partitions pi ¼ fxi . 4. r k of the polynomial pðkÞ represent the data partitions. A more efﬁcient method. r2 .A. The complexity of the ðk. R 3. The partitions are denoted by the pair pi ¼ fxi . partition creation requires k multiplications and one inversion operation modulo prime p. . That is. And inversion of such a matrix during data reconstruction would require at most Oðn3 Þ operations and Oðn2 Þ multiplications. x1 3 kÀ1 7 . we ﬁrst obtain k partitions using Algorithm 1a and then introduce redundancy to cope with possible inaccessibility of some of the partitions. . . kÞ partitioning scheme as a sub-algorithm. nÞ data partitioning scheme uses the ðk. An example of such a matrix in GF(2) is given below. 1 xn 1 x1 x2 1 x2 2 x2 n x3 1 x3 2 x3 n kÀ1 . . nÞ partitioning scheme works as follows. Such matrices are used in error-correction coding [19]. where the input is the algorithm is ~ R. .

a2 . Note. Kak / Information Sciences 179 (2009) 3323–3331 Any k ¼ 3 columns the above matrix A are linearly independent. r k g. S. while sensors generate crucial data and transmit it securely to the base station. To compute the correspondQk and ing polynomials and the two sets coefﬁcients C 1 and C 2 . . and conversely every distinct instance of set R gives rise to a distinct set of coefﬁcients A. Distinct sets of coefﬁcients (for a given constant term in the equation) result in distinct sets roots and vice versa. Theorem 2. Thus. Determining the coefﬁcients of a polynomial of degree k P 2 in a ﬁnite ﬁeld Zp . Every distinct set of roots is therefore a k À 1 combination of a multiset. An alternate approach to partitioning Once the user has computed all the k roots then he may compute the equation resulting from (4) and store one or all of the roots on different servers and the coefﬁcients on different servers. Since the above matrix only requires multiplications with 0 and 1. if we ﬁx d to a constant then (5) can be used to compute a set of roots. . . Algorithms 1a and 1b require OðkÞ computations while Algorithms 2a and 2b require Oðnlog nÞ computations. where p is prime. this may provide a scheme for fault tolerance. akÀ1 g. the recreated data d ðp À d Þ mod p. the time complexity of the scheme becomes linear for partition creation. Additionally.3328 A. Proof.r j 2R2 ðx À r j Þ 0 mod p and read the coefﬁcients of from the resulting polynomials. Both these implementations are efﬁcient and suitable for resource constraint environments such as wireless sensor networks. A note on applications 2 As seen above.r i 2R1 ðx À r i Þ 0 mod p Qk j¼1. where a matrix GF(2) cannot be used to simplify computations. A stronger variation to the protocol modulo a composite number An additional layer of security may be added to the implementation by performing computations modulo a composite number n ¼ p Á q. . Parakh. where each element has inﬁnite multiplicity. By the fundamental theorem of algebra. pkÀ1 requires XðdðkÀ1Þ!eÞ computations. such as that of Shamir’s. . . the reconstruction of data happens for most cases only at the base station. r2 . the number of possibilities for the choices of coefﬁcients is given by the following expression: p kÀ1 ¼ p À 1 þ ðk À 1Þ kÀ1 ðp þ k À 2Þ! ¼ ðp À 1Þ!ðk À 1Þ! ðp þ k À 2Þðp þ k À 3Þ Á Á Á ðp þ k À kÞðp À 1Þ! ¼ ðp À 1Þ!ðk À 1Þ! ðp þ k À 2Þðp þ k À 3Þ Á Á Á p ¼ ðk À 1Þ! pkÀ1 e ðk À 1Þ! ð10Þ Pd Here we have used the fact that in practice p ) k P 2. Therefore. Two sets of roots R1 and R2 are distinct if and only if 9r i 8rj ðr i – rj Þ. It is clear the at least one of the factors of the two polynomials is distinct because at least one of the roots is distinct. . p À 1g with repetition allowed. where 0 6 ai 6 p À 1. 1. For this we can use an equation such as the follows: . In such a variation.3. Since base stations have much higher computational capacity than the sensors themselves. every polynomial has unique set of roots. 4. respectively. p and q are primes. . we perform i¼1. and using an encryption exponent to encrypt the data before computing the roots of the equation. hence the result. and equivalently a k À 1 combination of set S ¼ f0. 2. Further. . where r i 2 R1 and rj 2 R2 . . Parenthetically. This is one of the advantages of our scheme over traditional schemes. a matrix in GF(2) may be used to reduce the computational burden on the sensors. by brute force. hence the resulting polynomials for a distinct set of roots are distinct. a user may store just one of the roots and k À 1 coefﬁcients on the network.4. . This is because two distinct sets of coefﬁcients represent distinct polynomials because two polynomials are said to be equal if and only if they have the same coefﬁcients. We have ignored the one prohibited case of all coefﬁcients being zero. These together represent k partitions. . If A represents the set of coefﬁcients A ¼ fa1 . 3. 2. knowledge of all the roots and coefﬁcients of the equation will not reveal any information about the data and the adversary will require to know the secret factors of n. which has no effect on our result. Recreation of original data can now be performed in two ways: either using (5) or choosing one of the roots at random and retrieving the coefﬁcients and substituting the appropriate values in the equation (3) to compute d xk þ akÀ1 xkÀ1 þ akÀ2 xkÀ2 þ Á Á Á þ a1 x mod p 0 0 ð9Þ Therefore. then by the above note each distinct instance of set A gives rise to a distinct set of roots R ¼ fr1 .

If we assume that p and n are secret values then the partitions may be stored in the form of an ‘‘encrypted link list”. given the distributed nature of the scheme. if not more secure. Alternatively one may use the key partitioning scheme presented in the following sub-section that uses polynomials in Galois Field GF(2m ). the security will be better than polynomial generalization of power encryption transformations [13. 6. Only the authentic user will know which of the multiple addresses is to be picked. . uðnÞÞ ¼ 1. than creating data partitions of enormous amounts of data. However. It is clear that compromise y of all the roots and coefﬁcients will at the most reveal c ¼ d mod n. such that their coefﬁcients lie in GF(2). one may use a random number generator and generate a random sequence of server IDs using a secret seed. encryption keys are very large random numbers and cannot be memorized. Kak / Information Sciences 179 (2009) 3323–3331 kÀ1 X i¼1 3329 xk þ akÀi xkÀi þ d 0 mod n y ð11Þ where y is a secretly chosen exponent and GCDðy. S. For example. Therefore. where ai ¼ 1 or 0 represents a polynomial belonging to GFð2m Þ and it may be written in Fig. Key partitioning in galois ﬁeld 2m A Galois Field (2m ) consists of polynomials of degree m À 1 or less. If we assume that p and n are public values then the pointer can be so encrypted that each decryption either leads to multiple addresses or depends on the knowledge of the factors or both. a user may ﬁnd it inefﬁcient to create data partitions and distribute them over the network but may wish to encrypt the data and store it on a single server which he trusts and keep encryption key secret. the access to the servers on which the key partitions are stored may be password restricted.A. We deﬁne an encrypted link list as a link list in which every pointer is in encrypted form and in order to ﬁnd out which node the present node points to. This approach may be more efﬁcient. Alternatively.1. One way to generate this seed may be to ﬁnd the hash of the original data and use it to seed the generated sequence and keep the hash secret. Keys may be partitioned using the scheme presented in the previous sections where the data is replaced by the key. when there is a large amount of data. Furthermore.9]. If the coefﬁcients are chosen such that (11) has k roots then k Y i¼1 r i d mod n y ð12Þ Appropriate coefﬁcients may be chosen in a manner similar to that described in previous sections. Almost always. 5. Addressing the data partitions The previous sections consider the security of the proposed scheme when prime p and composite n are public knowledge. 6. In order to compute the original value of d. 3. the user needs to decrypt the pointer which can be done only if certain secret information is known. there is nothing that compels the user to disclose the values of p and n. for which. amÀ1 xmÀ1 þ amÀ2 xmÀ2 þ Á Á Á þ a1 x þ a0 . Illustration of key partitioning. Parakh. the adversary will require the factors of n which are held secret by the user. the user may create partitions of the key and spread them over the network. Security by key distribution In some cases. We stress that this scenario is different from the secret sharing scenario where multiple parties are involved since our case involves data security for a single party.

Partition redundancy v/s no redundancy In cases where the accessibility to the partitions is guaranteed and the data disks are sufﬁciently backed-up. Hð2127 Þ to Hð2511 Þ possible trials on an average. Example 2. Multiplication in GFð2m Þ is performed modulo an irreducible polynomial gðxÞ ¼ xm þ bmÀ1 xmÀ1 þ Á Á Á þ b1 x þ b0 . 7. one would require Hð2mÀ1 Þ trial to compute the key with the knowledge of k À 1 partitions [24]. which include the implicit storage of encryption keys rather than the data. the random key generated is k ¼ 01011011 and the partitions are.3. One of the ways to solve this problem is to use multiple layers of encryption such that the decryption key for each layer is available with a different authority. p3 ðxÞ ¼ x5 þ x3 . This is certainly true for in-system storage where data partitions may be distributed over numerous sectors randomly and not contiguously stored. 3). In this scheme data is partitioned in such a way that each partition is implicitly secure and does not need to be encrypted. The resultant polynomial of degree m À 1 with binary coefﬁcients is taken as the required key in its binary representation and the randomly generated polynomials are the partitions (Fig. These partitions are stored on different servers on the network which are known only to the user. There is the further restriction of ‘‘fairness” that the intermediate election results must be kept secret. since the kth partition can be any random polynomial from the ﬁeld. This solves the problem of entrusting any one authority with all the votes and if any of the authorities (less than the threshold) try to cheat by deleting some of the cast ballots. partition redundancy may lead to un-necessary wastage of storage space. in the scheme presented above. p4 ¼ 11100010. Hð2mÀ1 Þ. Let gðxÞ ¼ x8 þ x2 þ 1 be an irreducible polynomial in GF(2m ). Parakh. such as p1 ðxÞ ¼ x7 þ x6 þ x. A polynomial gðxÞ is said to be irreducible if it cannot be factored into two or more polynomials each with coefﬁcients in GF(2) and each of degree less than m.e.3330 A. nÞ partitioning scheme by converting the existing partitions into decimal integers and using the extension presented in Section 2. Several variations of this scheme are described. Reconstruction of the data requires access to each server and the knowledge as to which servers the data partitions are stored. One way to accomplish it is by representing partition in the following manner: . 9. and where a subset of the partitions may be brought together to recreate the data. a1 a0 ). the user may proceed as follows. Conclusions and future work We have described an implicit security architecture suited for the application of online storage. . i. Also in case of network storage larger number of partitions would require larger upload time which may be avoidable. In practice. This is because all the partitions 2 GFð2m ) and that each one of them is independently and randomly chosen from the ﬁeld. . p1 ðxÞp2 ðxÞp3 ðxÞp4 ðxÞp5 ðxÞ kðxÞ mod gðxÞ. p2 ¼ 00010111. p5 ¼ 01000111. knowledge of k À 1 partitions does not reveal any information about the choice made for the kth partition. This obviously leaves open the question as to who is to be entrusted the encrypted votes. In order to generate a random 8 bit key and create ﬁve partitions of it. in order to compute this polynomial by brute force would require on an average half the number of trial of 2m . p4 ðxÞ ¼ x7 þ x6 þ x5 þ x and p5 ðxÞ ¼ x6 þ x2 þ x þ 1 and computes there product. keys are of 128–512 bits in AES. He generates k random polynomials of any random degree in GF(2m ) and computes their product modulo the irreducible polynomial gðxÞ. Such a system implicitly provides a back-up for the votes. kðxÞ p1 ðxÞp2 ðxÞp3 ðxÞp4 ðxÞp5 ðxÞ x6 þ x4 þ x3 þ x þ 1 mod gðxÞ Therefore. p2 ðxÞ ¼ x4 þ x2 þ x þ 1. Since the right hand side of the equation (the key) is completely dependent on the choice of left hand side of the equation (the factors of the key). a user can generate a random key and partition it into k partitions using the following procedure. Hence. p1 ¼ 11000010. leaves the kth partition completely undetermined and the adversary still requires Hð2mÀ1 Þ trials to determine the encryption key. then the votes may be recreated using the remaining partitions. One can propose variations to the scheme presented in this paper where the data partitions need to be brought together in a deﬁnite sequence. Therefore. where bi 2 GFð2Þ. Application in Internet voting protocols Internet voting is a challenge for cryptography because of its opposite requirements of conﬁdentiality and veriﬁability. S.e. The above scheme can be extended to a ðk. Kak / Information Sciences 179 (2009) 3323–3331 binary representation as (amÀ1 amÀ2 . 8. i. Note that the knowledge of k À 1 partitions. He chooses ﬁve polynomials of degree m À 1 or less (at random). Further. where kðxÞ is the ‘‘key polynomial” and the coefﬁcients are the binary representation of the key. p3 ¼ 00101000. A more effective way to implement fairness would be to avoid encryption keys altogether and divide each cast ballot into k or more pieces such that each authority is given one of the pieces [20].

Shamir. Usability News 8. Proceedings of IEEE 90 (2002) 369–373. Internet voting protocol based on implicit data security. A. References [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] Reuters – second life. Proctor. M. International Journal of Computer and Information Science 14 (1984) 103–109. Eldin. 2007. Department of Defence. The Art of Computer Programming. M. T. Communications. International Journal of Computer and Information Science 13 (1983) 337–346. Securing wireless sensor networks against aggregator compromises. Enhancing packet forwarding in mobile ad hoc networks by exploiting the information dispersal algorithm. vol. 153–191.. M. Knuth. The Design and Analysis of Computer Algorithms. Random Polynomials. in: CITWORKSHOPS’08: Proceedings of the 2008 IEEE Eighth International Conference on Computer and Information Technology Workshops. Error Correction Coding: Mathematical Methods and Algorithms. To protect sensitive information on sensors that are physically compromised. A probability model for reconstructing secret sharing under the internet environment. <http://www. Ghalwash. Cao. pp. C. Pervasive and Mobile Computing 3 (6) (2007) 659–676. so that the network can work even if some of the sensors happen to malfunction by themselves or as a consequence of intervention by an adversary. 477–482. Kak. E. Chen. M. F. Our approach leads to interesting research issues such as the optimal way to distribute the parts in a network of n sensors. Kak. Hopcroft. . Behavior Research Methods. Parakh. The proposed scheme is also of potential use in sensor networks. pp. N. Vu.reuters. Instruments and Computers 34 (2002) 163–169. <http://secondlife. Salvendy. S. Abdalla. Discrete Mathematics and its Applications. Secure fault tolerance in wireless sensor networks. D. A. R. W. Yang. p2 ðp3 Þ. Kak / Information Sciences 179 (2009) 3323–3331 3331 p1 . Academic Press. J. pp. S. Shenker. . August 2008. Dehkordi. ElShandidy. A nonlinear secret sharing scheme. Communications Magazine IEEE 46 (4) (2008) 134–141.-S. M. T. G. Parakh. Cheng. Addison-Wesley. T. 1982. Luo. December. Lien. K. 148–159. Lee. Sambandham. Yeh. R. Information Sciences 178 (9) (2008) 2262–2274. MIC-CCA 2008. October 2003. Ye. Ding. Denning. A. 2008. Subramanian. G. Exponentiation modulo a polynomial for data security. Cryptography and Data Security. Zhang. K. Kak. How to share a secret. S. 1974.-Y. 2. A. Oklahoma. A two-tier data dissemination model for large-scale wireless sensor networks.com>. Estrin. L. where pi ðpj Þ represents the encryption of pj by means of pi . ICCN 08. S. Zhang. Improving computer security for authentication of users: inﬂuence of proactive password restrictions. W. Securing distributed data storage and retrieval in sensor networks. Zhang. W. 1986. Stillwater. Zhang. Greenstein.1 (2006). D. R. Ad Hoc Networks 1 (2003) 333–349. Data dissemination with ring-based index for sensor networks. On the method of puzzles for key distribution. Mashhadi. Wang. Wiley-Interscience. S. Bharucha-Reid. Expanding the Use of Electronic Voting Technology for UOCAVA Citizens. . A. S. Dickson. 1-4. G. A. D. Second life herald.-J. Moon. This provides solutions that are more general than other distributed security models used in sensor networks [27]. First report-december 2004 (irish commission on electronic voting) appendix 2c. T. Sinnott.H. S. pp. Rosen. Govindan. 20–25. A new group key generating model for group sharing. C. November 2003. S. Renvall. S. Gheringer.L. 2008.com/stories/2008/05/07/eve-online-experiments-with-virtual-democracy/>. Choosing passwords: security and human factors. G. 2005. in: ACM International Conference on Mobile Computing and Networking. Communication of ACM 22 (11) (1979) 612–613. New efﬁcient and practical veriﬁable multi-secret sharing schemes. Claveirole. Password security: what users know and what they actually do. S. D. T. New York. LNCS 1172 (1996) 56–66. The key that is used to ﬁnd the location of these parts is encrypted and this encryption is changed from time to time. H.L. Difs: a distributed index for features in sensor networks. one may partition the data and store these parts on other sensors in the network. Porta. Computers and Applications. C. Lu. Information Security and Privacy. Mosharaka International Conference. in: Proceedings of 17th International Conference on Computer Communications and Networks 2008. Kak. Acknowledgement This work was partly funded by Central Rural Electric Corporation. Addison-Wesley Publishing Company. Zhang. Kuang-Hui. McGraw-Hill. A cubic public-key transformation. Systems and Signal Processing 26 (2007) 353–359.secondlifeherald. On supporting distributed collaboration in sensor networks. K. Such a scheme will increase the complexity of the brute force decryption task for an adversary. Ku. Y. J. It also leads to the question of reliability and tolerance to faults in such a system. Cao. in: IEEE Military Communications Conference. . L. A. Aho. Viniotis. 2004. Dover Publications. Information Sciences 116 (2-4) (1999) 109–127. May 2007. H. Riley. 1969. Y. B. Linear Groups with an Exposition of the Galois Field Theory. p1 ðp2 Þ. C. Addison-Wesley. J. pp. S. Ratnasamy. Yuan. Ullman. in: IEEE International Conference on Network Protocol. Porta. Wen-Tsuen. de Amorim. 2002.A. 1958.-L. Circuits. Information Sciences 64 (1-2) (1992) 83–94. C.

Sign up to vote on this title

UsefulNot useful- 5_490524096002523138 (1).docx
- 05640694
- 333 Integration2010 Proceedings
- FoCM
- Data encryption using LSB matching algorithm and Reserving Room before Encryption
- Reversible Data Hiding in Encrypted Images for Large Data Size Using DCT
- hal 9 tesis.docx
- caaaa
- PS129 Encryption
- Conv Codes 05
- Uniform Approximation of a Circle by a Parametric Polynomial Curve 2016 Computer Aided Geometric Design
- [IJCST-V3I4P25]
- A Novel Approach to Separable Reversible Data Hiding
- Smarandache–Boolean–Near–Rings and Algorithms
- Discriminant
- Scilab Tutorial
- Fully Homomorphic Encryption Using Ideal Lattices
- Second-Order Partial Derivatives
- easy-fhe
- Generating Functions - Micheal Pawliuk - Canada 2015
- Ensuring Security to the Compressed Sensing Data Using a Steganographic Approach
- [10]
- Zeros of Polynomials with Restricted Coefficients
- Steganography - Hiding Messages in the Noise of a Picture
- NetApp Encryption of Networked Storage
- 12-2
- Real Plane Algebraic Curves
- Zhegalkin Polynomial
- Global Heuristic Search on Encrypted Data (GHSED),, IJCSI, Volume 2, August 2009.
- 10.1.1.3
- Ins Implicit