
Token-free bounded delay codes and hash iteration

Sebastian Codrin Dițu


Faculty of Computer Science
Alexandru Ioan Cuza University
Iasi, Romania
sebastian.ditu@info.uaic.ro
Abstract. A hash iteration technique takes as input a hash compression function which works on fixed-length binary strings and outputs a hash function which works on arbitrary-length binary strings. In this paper we introduce token-free bounded delay codes and then use them to define a hash iteration technique. The resulting scheme, when applied to a hash compression function, preserves the following security properties: preimage resistance (Pre), always preimage resistance (aPre), everywhere preimage resistance (ePre) and collision resistance (Coll). Proofs that it preserves second-preimage resistance (Sec), always second-preimage resistance (aSec), and everywhere second-preimage resistance (eSec) are part of our future work. Comparisons with other iteration techniques are also provided.
I. INTRODUCTION AND PRELIMINARIES
Cryptographic hash functions are a very powerful tool for data authentication and integrity assurance, especially when large amounts of information are involved. Although they can be used as simple hash functions (e.g. fingerprints, checksums), in the information security context they ensure authentication, integrity and non-repudiation. To process inputs of arbitrary length, almost all existing hash functions rely on iteration techniques. A hash iteration technique takes as input a hash compression function which works on fixed-length binary strings and outputs a hash function which works on arbitrary-length binary strings. The difficulty with iteration techniques is that consistency must be guaranteed: if the initial compression function achieves some security properties, then the hash function resulting from the iteration must achieve those properties as well.
a) Contributions: We contribute to the topic by investigating how codes with bounded delay can support the security of cryptographic hash functions. We introduce a new iteration technique which follows the general Merkle-Damgård model and is based on token-free bounded delay codes. Our results show that the newly introduced technique preserves the following security properties: Pre, ePre, aPre, Coll. To the best of our knowledge, the concept of token-free bounded delay codes is new. The existence of this type of code is also investigated in [1], where we show that the class is nonempty and that it is decidable whether a code is token-free or not. Proving that our technique preserves Sec, eSec and aSec as well is part of our future work.
b) Paper organization: Section 2 introduces the reader to the area of interest by presenting a general iteration technique model and then the existing hash security properties [2]. In Section 3 we define token-free bounded delay codes and then introduce the token-free bounded delay iteration technique together with the proofs that it preserves the security properties. Finally, we compare the token-free technique with other existing iteration techniques with respect to the preservation of these security properties. The last section concludes our work.
c) Preliminaries: By $M \stackrel{\$}{\leftarrow} S$ we understand the random extraction of an element (referred to as $M$) from the set $S$. $M = M_1 \cdots M_m \in \{0,1\}^m$ is an $m$-bit string and, if $1 \le a \le b \le m$, we write $M[a..b]$ or $M_a \,\|\, \cdots \,\|\, M_b$ for $M_a \cdots M_b$, and we write $|M|_b$ for the last $b$ bits of $M$. A hash-function family is a function $H : \mathcal{K} \times \mathcal{M} \to \mathcal{Y}$ where $\mathcal{K}$ and $\mathcal{Y}$ are finite nonempty sets and $\mathcal{M}$ and $\mathcal{Y}$ are sets of strings. Often we will write the first argument to $H$ as a subscript, so that $H_K(M) = H(K, M)$ for all $M \in \mathcal{M}$. $\mathrm{Time}_{H,m}$ means the minimum, over all programs $P_H$ that compute $H$, of the length of $P_H$ plus the worst-case running time of $P_H$ over all inputs $(K, M)$ where $K \in \mathcal{K}$ and $M \in \{0,1\}^m$; plus the minimum, over all programs $P_K$ that sample from $\mathcal{K}$, of the time to compute the sample plus the size of $P_K$. An adversary is an algorithm that takes any number of inputs. Some of these inputs may be long strings, so as a convention the adversary can read the $i$th bit of argument $j$ in constant time. If $A$ is an adversary and $\mathbf{Adv}^{xxx}_H(A)$ is a measure of adversarial advantage already defined, then we write $\mathbf{Adv}^{xxx}_H(R)$ to mean the maximal value of $\mathbf{Adv}^{xxx}_H(A)$ over all adversaries $A$ that use resources bounded by $R$. We use the same meaning for $\mathrm{Time}_{H,m}$ and for an adversary as in [2]. For both $\mathrm{Time}_{H,m}$ and the adversarial measure, some underlying RAM model of computation must be fixed. By $\lceil x \rceil$ we understand the smallest integer greater than or equal to $x$.
II. ITERATION TECHNIQUES FOR HASH FUNCTIONS AND
SECURITY PROPERTIES
Introduced by Merkle and, independently, by Damgård [2], hash iteration techniques aim at producing hash functions starting from hash compression functions (which compress only fixed-length binary strings). Usually, a hash iteration technique preprocesses the input message (the preprocessing phase), then breaks it up into blocks of fixed size and iterates over them with the compression (or round) function. The preprocessing phase, together with the way the compression function iteratively processes every message block, defines the iteration technique. So, in an iterative process the output of the resulting hash function depends on each intermediate round function result. The Strengthened Merkle-Damgård (SMD) technique can be seen as a proof of concept for the construction model introduced by Merkle and Damgård. Many iterative hash functions are based on this iteration technique (e.g. SHA-1).

Scheme          Coll  Sec  aSec  eSec  Pre  aPre  ePre
SMD             Y     N    N     N     N    N     Y
Linear          N     N    N     N     N    N     Y
XOR-Linear      Y     N    N     Y     N    N     Y
Shoup's         Y     N    N     Y     N    N     Y
Prefix-free MD  N     N    N     N     N    N     Y
Randomized      Y     N    N     N     N    N     Y
HAIFA           Y     N    N     N     N    N     Y
Enveloped MD    Y     N    N     N     N    N     Y
SMT             Y     N    N     N     N    N     Y
Tree Hash       N     N    N     N     N    N     Y
XOR Tree        ?     ?    N     ?     Y    N     Y
ROX             Y     Y    Y     Y     Y    Y     Y
Token-free      Y     ?    ?     ?     Y    Y     Y
TABLE I: Overview of some popular iterations and the properties they preserve. The symbol Y means that the notion is provably preserved; N means that it is not preserved, in the sense that there exists either a counterexample or a proof supporting that; ? means that neither a proof nor a counterexample is known [3].
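input : $h : \{0,1\}^{n+b} \to \{0,1\}^n$, the compression function;
        $M \in \mathcal{M}$, a finite arbitrary-length message;
        $K \in \mathcal{K}$, the hash key
output: $H^{*}$, the message digest

$x = \lceil (|M| + 2)/b \rceil \cdot b$;
$m_1 \,\|\, \cdots \,\|\, m_L \leftarrow M \,\|\, 1\,0\,0^{x-|M|-2} \,\|\, |M|_b$;
$y_0 \leftarrow IV$;
for $i \leftarrow 1$ to $L$ do
    $y_i \leftarrow h_K(m_i \,\|\, y_{i-1})$;
end
$H^{*} = y_L$;
return $H^{*}$;
Algorithm 1: SMD iteration technique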
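The chaining structure of Algorithm 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the compression function `h` is a stand-in built from truncated SHA-256, the sizes `N_BYTES` and `B_BYTES` are arbitrary, and padding is done on whole bytes (a `0x80` marker, zero bytes, then the message bit length) rather than on single bits as in the algorithm above.

```python
import hashlib

N_BYTES = 4   # chaining value size n, in bytes (assumed for illustration)
B_BYTES = 8   # message block size b, in bytes (assumed for illustration)

def h(key: bytes, block: bytes) -> bytes:
    """Toy keyed compression function on n+b bytes; a stand-in, not a real design."""
    assert len(block) == N_BYTES + B_BYTES
    return hashlib.sha256(key + block).digest()[:N_BYTES]

def smd_pad(message: bytes) -> bytes:
    """Byte-level length strengthening: marker byte, zero fill, then the bit length."""
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % B_BYTES)
    return padded + (8 * len(message)).to_bytes(B_BYTES, "big")

def smd_hash(key: bytes, message: bytes, iv: bytes = bytes(N_BYTES)) -> bytes:
    """Iterate h over the padded message blocks, starting from the IV."""
    padded = smd_pad(message)
    y = iv
    for i in range(0, len(padded), B_BYTES):
        y = h(key, padded[i:i + B_BYTES] + y)   # y_i = h_K(m_i || y_{i-1})
    return y
```

For example, `smd_hash(b"key", b"abc")` processes two 8-byte blocks: the padded message and the encoded length.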
Any hash iteration technique must preserve the security
properties of the function it iterates over. We review the
seven existing security notions [3]: the standard three of
collision-resistance (Coll), preimage-resistance (Pre) and sec-
ond preimage-resistance (Sec); and the always and everywhere
variants of the last two of the three.
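Definition 1. Let $H : \mathcal{K} \times \mathcal{M} \to \mathcal{Y}$ be a hash-function family and let $m$ be a number such that $\{0,1\}^m \subseteq \mathcal{M}$. Let $A$ be an adversary. Then:
$$\mathbf{Adv}^{\mathrm{Pre}[m]}_{H}(A) = \Pr\left[ K \stackrel{\$}{\leftarrow} \mathcal{K};\ M \stackrel{\$}{\leftarrow} \{0,1\}^m;\ Y \leftarrow H_K(M);\ M' \stackrel{\$}{\leftarrow} A(K, Y) : H_K(M') = Y \right] \tag{1}$$
$$\mathbf{Adv}^{\mathrm{ePre}}_{H}(A) = \max_{Y \in \mathcal{Y}} \Pr\left[ K \stackrel{\$}{\leftarrow} \mathcal{K};\ M \stackrel{\$}{\leftarrow} A(K) : H_K(M) = Y \right] \tag{2}$$
$$\mathbf{Adv}^{\mathrm{aPre}[m]}_{H}(A) = \max_{K \in \mathcal{K}} \Pr\left[ M \stackrel{\$}{\leftarrow} \{0,1\}^m;\ Y \leftarrow H_K(M);\ M' \stackrel{\$}{\leftarrow} A(Y) : H_K(M') = Y \right] \tag{3}$$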
While preimage resistance captures the usual notion of a one-way function, everywhere preimage resistance states that, for whatever range point is selected, it is computationally hard to find a preimage of it. Always preimage resistance strengthens Pre so that it becomes meaningful to say that a fixed function like SHA-1 is one-way: one can regard SHA-1 as a member of a family of hash functions (keyed, for example, by the initial chaining value) and ask whether it remains hard to find a preimage of a random point for that particular member.
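The Pre experiment of equation (1) can be read as a game that is run many times. The following Python sketch estimates the advantage of an adversary empirically; the toy hash `H` (truncated SHA-256 with an 8-bit range), the 4-byte keys and the exhaustive-search adversary are all assumptions made for illustration and do not come from [2].

```python
import hashlib
import random

def H(key: bytes, message: bytes, n_bits: int = 8) -> int:
    """Toy keyed hash with an n_bits-bit range (hypothetical stand-in)."""
    digest = hashlib.sha256(key + message).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)

def brute_force_adversary(key: bytes, y: int, m_bytes: int):
    """Try every m_bytes-long message and return the first preimage of y."""
    for candidate in range(256 ** m_bytes):
        guess = candidate.to_bytes(m_bytes, "big")
        if H(key, guess) == y:
            return guess
    return None

def estimate_pre_advantage(adversary, m_bytes: int, trials: int, seed: int = 0) -> float:
    """Fraction of runs of the Pre game won by the adversary."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        key = bytes(rng.getrandbits(8) for _ in range(4))            # K <-$ K
        message = bytes(rng.getrandbits(8) for _ in range(m_bytes))  # M <-$ {0,1}^m
        y = H(key, message)                                          # Y <- H_K(M)
        guess = adversary(key, y, m_bytes)                           # M' <-$ A(K, Y)
        wins += guess is not None and H(key, guess) == y
    return wins / trials
```

Because the exhaustive adversary searches the whole message space, it always wins on such a small toy function; a preimage-resistant family is one for which no adversary with realistic resources achieves a noticeable advantage.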
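Definition 2. Let $H : \mathcal{K} \times \mathcal{M} \to \mathcal{Y}$ be a hash-function family and let $m$ be a number such that $\{0,1\}^m \subseteq \mathcal{M}$. Let $A$ be an adversary. Then:
$$\mathbf{Adv}^{\mathrm{Sec}[m]}_{H}(A) = \Pr\left[ K \stackrel{\$}{\leftarrow} \mathcal{K};\ M \stackrel{\$}{\leftarrow} \{0,1\}^m;\ M' \stackrel{\$}{\leftarrow} A(K, M) : (M \ne M') \wedge (H_K(M) = H_K(M')) \right] \tag{4}$$
$$\mathbf{Adv}^{\mathrm{eSec}[m]}_{H}(A) = \max_{M \in \{0,1\}^m} \Pr\left[ K \stackrel{\$}{\leftarrow} \mathcal{K};\ M' \stackrel{\$}{\leftarrow} A(K) : (M \ne M') \wedge (H_K(M) = H_K(M')) \right] \tag{5}$$
$$\mathbf{Adv}^{\mathrm{aSec}[m]}_{H}(A) = \max_{K \in \mathcal{K}} \Pr\left[ M \stackrel{\$}{\leftarrow} \{0,1\}^m;\ M' \stackrel{\$}{\leftarrow} A(M) : (M \ne M') \wedge (H_K(M) = H_K(M')) \right] \tag{6}$$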
Briefly, while second-preimage resistance states that it is hard to find a partner for a given random domain point, everywhere second-preimage resistance (eSec) states that it is hard for an adversary to find a partner for any particular domain point. Always second-preimage resistance states that for a fixed function like SHA-1 it remains hard to find a partner for a random point (again regarding it as a member of a family of hash functions).
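Definition 3. Let $H : \mathcal{K} \times \mathcal{M} \to \mathcal{Y}$ be a hash-function family and let $A$ be an adversary. Then:
$$\mathbf{Adv}^{\mathrm{Coll}}_{H}(A) = \Pr\left[ K \stackrel{\$}{\leftarrow} \mathcal{K};\ (M, M') \stackrel{\$}{\leftarrow} A(K) : (M \ne M') \wedge (H_K(M) = H_K(M')) \right] \tag{7}$$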
The collision resistance property measures how difficult it is for an adversary to find two distinct points in the domain of a hash function that hash to the same range point.
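To make the Coll game of equation (7) concrete, the sketch below runs a generic birthday-style adversary against a toy hash. The function `H` (SHA-256 truncated to 16 bits) and the choice of 4-byte counter messages are assumptions for illustration only; by the pigeonhole principle the search must stop after at most $2^{16} + 1$ messages.

```python
import hashlib

def H(key: bytes, message: bytes, n_bits: int = 16) -> int:
    """Toy keyed hash with an n_bits-bit range (hypothetical stand-in)."""
    digest = hashlib.sha256(key + message).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)

def birthday_adversary(key: bytes, n_bits: int = 16):
    """Hash distinct counter messages until two of them share a digest."""
    seen = {}
    counter = 0
    while True:
        message = counter.to_bytes(4, "big")
        y = H(key, message, n_bits)
        if y in seen:
            return seen[y], message      # (M, M') with M != M'
        seen[y] = message
        counter += 1

def wins_coll_game(key: bytes, pair, n_bits: int = 16) -> bool:
    """Winning condition of the Coll experiment."""
    m1, m2 = pair
    return m1 != m2 and H(key, m1, n_bits) == H(key, m2, n_bits)
```

With an $n$-bit range such a search needs on the order of $2^{n/2}$ hash evaluations, which is why collision resistance is only meaningful for sufficiently long digests.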
The relationships between the security properties presented above are depicted in Figure 1. The basis for understanding them is the difference between a conventional implication and a provisional implication. Briefly, while a conventional implication is a regular one, a provisional implication depends on some technical condition (e.g. the compression factor of the hash function, in which case the strength of the implication increases with the compression rate of the hash function). The formal definition of the implications, taken from [2], is as follows:
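Definition 4. Fix $\mathcal{K}$, $\mathcal{M}$, $m$, and $n$ where $\{0,1\}^m \subseteq \mathcal{M}$. Suppose that xxx and yyy are labels for which $\mathbf{Adv}^{xxx}_H$ and $\mathbf{Adv}^{yyy}_H$ have been defined for any $H : \mathcal{K} \times \mathcal{M} \to \{0,1\}^n$.
Conventional implication: We say that xxx implies yyy, written xxx $\to$ yyy, if $\mathbf{Adv}^{yyy}_H(t) \le c\,\mathbf{Adv}^{xxx}_H(t')$ for all hash functions $H : \mathcal{K} \times \mathcal{M} \to \{0,1\}^n$, where $c$ is an absolute constant and $t' = t + c\,\mathrm{Time}_{H,m}$;
Provisional implication: We say that xxx implies yyy to $\epsilon$, written xxx $\to$ yyy to $\epsilon$, if $\mathbf{Adv}^{yyy}_H(t) \le c\,\mathbf{Adv}^{xxx}_H(t') + \epsilon$ for all hash functions $H : \mathcal{K} \times \mathcal{M} \to \{0,1\}^n$, where $c$ is an absolute constant and $t' = t + c\,\mathrm{Time}_{H,m}$.
In the definition above, each label implicitly carries a length parameter, which is either $[m]$ (for Pre, aPre, Sec, aSec, eSec) or empty (for ePre, Coll).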
Figure 1. Summary of the relationships among notions of hash-function security (Pre, ePre, aPre, Sec, aSec, eSec, Coll). Solid arrows represent conventional implications, dotted arrows represent provisional implications, and the lack of an arrow represents a separation.
III. TOKEN-FREE BOUNDED DELAY CODES AND HASH
ITERATION
A. Token-free bounded delay codes
In this section we define token-free codes with bounded delay. After that we describe how to use them in a hash iteration technique. At an informal level, a code is a set of words such that any product of these words can be uniquely decoded. Of the special classes of codes investigated in the literature, codes with bounded delay are of interest from the point of view of many problems in language theory [4].
Definition 5. A code $C$ has bounded delay $k \ge 1$ from left to right if, whenever $x_{i_1} x_{i_2} \cdots x_{i_k}$ is a prefix of $x_{j_1} x_{j_2} \cdots x_{j_n}$, then $x_{i_1} = x_{j_1}$. Here all the $x$'s are code words of $C$. Bounded delay from right to left is defined analogously, using suffixes. Of course, a code can have both left-to-right and right-to-left $k$-bounded delay.
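For a finite code, the left-to-right condition of Definition 5 can be checked by brute force. The Python sketch below is only an illustration: the function names are hypothetical, code words are assumed to be non-empty strings, and the procedure tests the bounded-delay condition for a given $k$ without verifying that the input set is actually a code.

```python
from functools import lru_cache
from itertools import product

def first_words(code: frozenset, u: str) -> set:
    """Code words x for which u is a prefix of some product x x_2 ... x_n."""
    @lru_cache(maxsize=None)
    def is_prefix_of_product(rest: str) -> bool:
        if rest == "":
            return True
        for w in code:
            if w.startswith(rest):
                return True
            if rest.startswith(w) and is_prefix_of_product(rest[len(w):]):
                return True
        return False

    return {w for w in code
            if w.startswith(u)
            or (u.startswith(w) and is_prefix_of_product(u[len(w):]))}

def has_bounded_delay(code, k: int) -> bool:
    """Check the left-to-right bounded delay condition for delay k."""
    code = frozenset(code)
    assert all(code), "code words must be non-empty"
    for words in product(sorted(code), repeat=k):
        u = "".join(words)
        if first_words(code, u) != {words[0]}:
            return False
    return True
```

Under these assumptions, the prefix code {0, 10, 11} has delay 1, the code {0, 01} has delay 2 but not 1, and {0, 01, 11} fails for every $k$ because $0(11)^{k-1}$ is also a prefix of a product starting with 01.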
The definition states that we need to read at most $k$ code words to identify how to decode the first word (from a total of $n$ code words forming the whole product over $C$ which we are attempting to decode). One consequence of that would be that $x_{i_1} x_{i_2} \cdots x_{i_t} = x_{j_1} x_{j_2} \cdots x_{j_t}$, $t < k$. The definition stated above is A. Salomaa's definition of codes with bounded delay [4]. Also, note that if a code $C$ has bounded delay $k$ then, for every $k' \ge k$, $C$ is a $k'$-bounded delay code as well [4]. An important fact to mention is that the notion of a bounded delay code $C$ is meaningful only if $C$ is a code. As a preliminary notation, for $u, v \in \Sigma^*$, $u \sqsubseteq v$ denotes that $u$ is a subword of $v$.
Definition 6. A code $C$ over the alphabet $\Sigma$ is $m$-token-free if and only if there exists $\omega \in \Sigma^m$ such that $\forall w = c_{i_1} c_{i_2} \cdots c_{i_k}$, with $c_{i_j} \in C$, $1 \le j \le k$, $\omega \not\sqsubseteq w$.
Definition 7. A bounded delay code which is also $m$-token-free is called an $m$-token-free bounded delay code.
Lemma 1. If a code $C$ is $m$-token-free then, for any $m' > m$, $C$ is $m'$-token-free as well.
Proof: Let $C$ be an $m$-token-free code (over the alphabet $\Sigma$). Let $\omega \in \Sigma^*$, with $|\omega| = m$, be such that $\forall c \in C^*$, $\omega \not\sqsubseteq c$. Consider the word $w = \omega x$, where $x \in \Sigma^*$; thus $w \in \Sigma^{m+|x|}$. No $c \in C^*$ with $w \sqsubseteq c$ can exist, because otherwise we would have $\omega \sqsubseteq c$, which contradicts the choice of $\omega$.
In [1] it is shown that it is decidable whether a code $C \subseteq A^+$ is an $m$-token-free code or not.
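For a finite code and a fixed $m$, a token $\omega$ as in Definition 6 can be searched for exhaustively. The sketch below is not the decision procedure of [1]; it is a straightforward illustration with hypothetical function names, assuming non-empty code words given as Python strings. A word occurs inside some product of code words exactly when it lies inside a single code word, or when it starts with a suffix of a code word and continues as a prefix of a product.

```python
from functools import lru_cache
from itertools import product

def find_token(code, alphabet: str, m: int):
    """Return some omega of length m that occurs in no product over code, or None."""
    code = frozenset(code)
    assert all(code), "code words must be non-empty"

    @lru_cache(maxsize=None)
    def is_prefix_of_product(rest: str) -> bool:
        if rest == "":
            return True
        return any(w.startswith(rest)
                   or (rest.startswith(w) and is_prefix_of_product(rest[len(w):]))
                   for w in code)

    def is_subword_of_product(omega: str) -> bool:
        for x in code:
            for start in range(len(x)):
                s = x[start:]                 # part of x from the start position on
                if s.startswith(omega):       # omega fits inside x
                    return True
                if omega.startswith(s) and is_prefix_of_product(omega[len(s):]):
                    return True
        return False

    for letters in product(sorted(alphabet), repeat=m):
        omega = "".join(letters)
        if not is_subword_of_product(omega):
            return omega
    return None
```

Under these assumptions the code {0, 01} is 2-token-free with $\omega = 11$, since every 1 in a product is preceded by a 0. In line with Lemma 1, a token of length 3 also exists for it, for instance 011.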
B. Hash iteration by token-free bounded delay codes
In this section we show how to use token-free codes with bounded delay in a hash iteration technique.
Briefly, the preprocessing technique consists of:
1) find $C$, a token-free bounded delay code, together with the certificate $\omega$ that ensures that $C$ is token-free;
2) preprocess any input message $M$ before the actual iteration process as follows:
• encode $M$ as $c(M)$;
• prefix $c(M)$ with the sequence $\omega$ found at step 1;
• pad the message just like the SMD technique.
Thus, the preprocessing procedure is: token-free-proc($M$) = ls-pad($\omega \,\|\, c(M)$). Note that in the above construction, $c : \Sigma^* \to C^*$ is a function which encodes any input of arbitrary size into a sequence of code words. We consider it invertible: given a coded sequence, it decodes it back to its preimage.
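The preprocessing steps above can be sketched on bit strings as follows. Everything concrete here is an assumption for illustration: the toy code {0, 01} with token $\omega = 11$, the bit-to-code-word mapping `CODE`, the block size `B`, and a padding routine `ls_pad` that appends a 1, zero fill and the $b$-bit input length in the spirit of the SMD padding.

```python
CODE = {"0": "0", "1": "01"}   # hypothetical encoding c: one code word per input bit
OMEGA = "11"                   # token that never occurs in products over {0, 01}
B = 8                          # block length b in bits (assumed)

def encode(message_bits: str) -> str:
    """c(M): replace every input bit by its code word."""
    return "".join(CODE[bit] for bit in message_bits)

def decode(code_bits: str) -> str:
    """Inverse of encode; a 0 followed by a 1 is the code word 01."""
    out, i = [], 0
    while i < len(code_bits):
        if code_bits[i:i + 2] == "01":
            out.append("1")
            i += 2
        else:
            out.append("0")
            i += 1
    return "".join(out)

def ls_pad(bits: str, b: int = B) -> str:
    """Append a 1, zero fill to a block boundary, then the b-bit length of bits."""
    assert len(bits) < 2 ** b
    padded = bits + "1"
    padded += "0" * (-len(padded) % b)
    return padded + format(len(bits), f"0{b}b")

def token_free_proc(message_bits: str, b: int = B) -> list:
    """token-free-proc(M) = ls-pad(omega || c(M)), split into b-bit blocks."""
    padded = ls_pad(OMEGA + encode(message_bits), b)
    return [padded[i:i + b] for i in range(0, len(padded), b)]
```

For the input `"10"` the encoded string is `010`, the prefixed string is `11010`, and the result consists of the two blocks `11010100` and `00000101`.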
We present the token-free hash iteration technique in Algo-
rithm 2.
input : $h : \{0,1\}^{n+b} \to \{0,1\}^n$, the compression function;
        $M \in \mathcal{M}$, a finite arbitrary-length message;
        $K \in \mathcal{K}$, the hash key
output: $H^{*}$, the message digest

$(m_1, m_2, \ldots, m_L) \leftarrow$ token-free-proc($M$);
$y_0 \leftarrow 0^n$;
for $i \leftarrow 1$ to $L$ do
    $(z_1, z_2, \ldots, z_p) \leftarrow$ token-free-proc($m_i \,\|\, y_{i-1}$);
    $t_1 \leftarrow h_K(z_1 \,\|\, z_p|_n)$;
    for $j \leftarrow 2$ to $p$ do
        $t_j \leftarrow h_K(z_j \,\|\, t_{j-1})$;
    end
    $y_i \leftarrow h_K(m_i \,\|\, t_p)$;
end
$H^{*} = y_L$;
return $H^{*}$;
Algorithm 2: Token-free iteration technique
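The two nested loops of Algorithm 2 can be sketched in Python on bit strings. This is a rough illustration under several assumptions that are not fixed by the text: the toy code {0, 01} with $\omega = 11$, equal sizes $n = b = 8$, a stand-in compression function built from SHA-256, and, in particular, the reading of the first inner step as $t_1 = h_K(z_1 \,\|\, \text{last } n \text{ bits of } z_p)$.

```python
import hashlib

N = 8                          # chaining value length n in bits (assumed)
B = 8                          # block length b in bits (assumed)
OMEGA = "11"                   # token for the toy code {0, 01}
CODE = {"0": "0", "1": "01"}   # hypothetical encoding c

def h(key: str, bits: str) -> str:
    """Toy compression function on n+b bits; a stand-in, not a real design."""
    assert len(bits) == N + B
    digest = hashlib.sha256((key + "|" + bits).encode()).digest()
    return format(int.from_bytes(digest, "big") >> (256 - N), f"0{N}b")

def token_free_proc(bits: str) -> list:
    """ls-pad(omega || c(bits)), split into b-bit blocks (length taken mod 2^b)."""
    encoded = OMEGA + "".join(CODE[bit] for bit in bits)
    padded = encoded + "1"
    padded += "0" * (-len(padded) % B)
    padded += format(len(encoded) % 2 ** B, f"0{B}b")
    return [padded[i:i + B] for i in range(0, len(padded), B)]

def token_free_hash(key: str, message_bits: str) -> str:
    y = "0" * N                                  # y_0 = 0^n
    for m_i in token_free_proc(message_bits):
        z = token_free_proc(m_i + y)             # blocks of m_i || y_{i-1}
        t = h(key, z[0] + z[-1][-N:])            # t_1: ASSUMED reading of this step
        for z_j in z[1:]:
            t = h(key, z_j + t)                  # t_j = h_K(z_j || t_{j-1})
        y = h(key, m_i + t)                      # y_i = h_K(m_i || t_p)
    return y
```

The sketch makes the performance trade-off visible: every outer block costs one call to `token_free_proc` and $p + 1$ compression calls, instead of the single call used by SMD.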
Lemma 2. If $x \ne y$, with $x, y \in \Sigma^*$ ($\Sigma$ being the alphabet for the token-free code), then token-free-proc($x$) $\ne$ token-free-proc($y$).
Proof: Assume the contrary. Then $c(x) = c(y)$ for $x \ne y$, which contradicts the fact that $C$ is a code.
Lemma 3. If $x \ne y$, with $x, y \in \Sigma^*$, then token-free-proc($x$) cannot be a suffix of token-free-proc($y$) and vice versa.
Proof: Assume the contrary and consider the case when the former is a suffix of the latter. Therefore, we have the following equation:
token-free-proc($y$) = $\alpha \,\|\,$ token-free-proc($x$) = $\alpha \,\|\, \omega \,\|\, c(x) = \omega \,\|\, c(y)$,
which is impossible, because:
1) $\omega$ is chosen so that this could not happen;
2) $c(y) = \varepsilon$ (the empty word) is not possible because $C$ is a code (this would again contradict the choice of $\omega$);
3) if $\alpha = \varepsilon$ then we should have $c(x) = c(y)$ for $x \ne y$, which contradicts Lemma 2.


C. Security properties. Proofs
Please note that, when referring to a hash function, we actually refer to a keyed hash function.
Theorem 1. Let $H$ be the compression hash function family with which the token-free bounded delay iteration technique is instantiated. Let $H^{*}$ be the hash function family obtained through the iteration technique. Then:
$$\mathbf{Adv}^{\mathrm{Pre}[m]}_{H^{*}}(R) \le \mathbf{Adv}^{\mathrm{Pre}[m]}_{H}(R)$$
Proof: We will use an adversary for the iterated hash function relative to the preimage-resistance property (call it $A$) to construct the following adversary $A'$ for the compression function:
1) generate $k = \mathrm{rand}(\mathcal{K})$;
2) generate $M = \mathrm{rand}(\{0,1\}^m)$ and compute $y = h^{*}(k, M)$;
3) obtain $M' \leftarrow A(k, y)$ (run adversary $A$ on the iterated hash function);
4) compute $t_p$, where $h(M'_l \,\|\, t_p) = y$ and $l$ is the number of blocks into which $M'$ is split;
5) return $M'_l \,\|\, t_p$, where $M'_l$ is the last chunk obtained from encoding and splitting $M'$.
Therefore, because $A'$ runs only $A$ and no other computation is relevant for its complexity, we have:
$$\mathbf{Adv}^{\mathrm{Pre}[m]}_{H}(A') = \mathbf{Adv}^{\mathrm{Pre}[m]}_{H^{*}}(A)$$
The fact that the adversary $A'$ is a valid algorithm for $\mathbf{Adv}^{\mathrm{Pre}[m]}_{H}$ is based on the following statement: when $H$ is a compression function we have
$$\mathbf{Adv}^{\mathrm{Pre}[m']}_{H}(A') = \mathbf{Adv}^{\mathrm{Pre}[m]}_{H}(A'), \quad m \ge m',$$
with $m'$ being the input size accepted by $H$. Please note two things:
• our statement is based on the fact that, in the definition of $\mathbf{Adv}^{\mathrm{Pre}[m]}_{H}$, we truncate the $m$-sized generated input to $m'$ bits;
• when $m < m'$ we cannot apply $\mathbf{Adv}^{\mathrm{Pre}[m]}_{H}$, since it assumes that a hash value is computed from a random input of size at least $m'$.
Because adversary $A$ is the best algorithm to find a preimage for the iterated hash function and $\mathbf{Adv}^{\mathrm{Pre}[m]}_{H}(A') \le \mathbf{Adv}^{\mathrm{Pre}[m]}_{H}(R)$, we then have:
$$\mathbf{Adv}^{\mathrm{Pre}[m]}_{H^{*}}(R) \le \mathbf{Adv}^{\mathrm{Pre}[m]}_{H}(R)$$
Theorem 2. Let $H$ be the compression hash function family with which the token-free bounded delay iteration technique is instantiated. Let $H^{*}$ be the hash function family obtained through the iteration technique. Then:
$$\mathbf{Adv}^{\mathrm{ePre}}_{H^{*}}(R) \le \mathbf{Adv}^{\mathrm{ePre}}_{H}(R)$$
Proof: The idea of the proof is similar to the previous one. We use an adversary, call it $A$, for the iterated hash function (but relative to everywhere preimage resistance in this case) to construct an adversary $A'$ for the compression function having the same complexity as $A$. Therefore, because $A$ is the best algorithm to find a preimage for the iterated hash function, we similarly prove that:
$$\mathbf{Adv}^{\mathrm{ePre}}_{H^{*}}(R) \le \mathbf{Adv}^{\mathrm{ePre}}_{H}(R)$$
Theorem 3. Let $H$ be the compression hash function family with which the token-free bounded delay iteration technique is instantiated. Let $H^{*}$ be the hash function family obtained through the iteration technique. Then:
$$\mathbf{Adv}^{\mathrm{aPre}[m]}_{H^{*}}(R) \le \mathbf{Adv}^{\mathrm{aPre}[m]}_{H}(R)$$
Proof: The idea of the proof is similar to the one in Theorem 1. We use an adversary, call it $A$, for the iterated hash function (but relative to the always preimage-resistance property in this case) to construct an adversary $A'$ for the compression function having the same complexity as $A$. Therefore, because $A$ is the best algorithm to find a preimage for the iterated hash function, we similarly prove that:
$$\mathbf{Adv}^{\mathrm{aPre}[m]}_{H^{*}}(R) \le \mathbf{Adv}^{\mathrm{aPre}[m]}_{H}(R)$$
Lemma 4. Let $H$ be the compression hash function family with which the token-free bounded delay iteration technique is instantiated. Let $H^{*}$ be the hash function family obtained through the iteration technique. Then any successful collision-resistance attack on the iterated hash function can be traced back to the compression function on which the iteration relies.
Proof: Let us suppose that an adversary, using its advantage against the iterated hash function relative to the collision resistance property (Coll), is able to find two distinct messages $M, M'$ (with $(M, M') \leftarrow A(k)$) for which $h^{*}(k, M) = h^{*}(k, M')$, where $M' \ne M$, $k$ being the key value which provides the biggest advantage to the adversary amongst all possible key values, with $M, M' \in \{0,1\}^m$. Thus the adversary has $M$ and $M'$, with $M \ne M'$, such that $h^{*}(k, M) = h^{*}(k, M')$. Let $t_i, m_i$ (respectively $t'_i, m'_i$) be the intermediate values in the computation of $h^{*}(k, M)$ (respectively $h^{*}(k, M')$), and $l$ (resp. $l'$) the number of blocks of the message $M$ (resp. $M'$). Without loss of generality, let us suppose that $l \ge l'$. Considering that $h^{*}(k, M) = h^{*}(k, M')$, we have the following equation: $h^{*}(k, M) = h_K(m_l \,\|\, t_p) = h_K(m'_{l'} \,\|\, t'_{p'}) = h^{*}(k, M')$. We now have to consider two cases:
• $m_l \,\|\, t_p \ne m'_{l'} \,\|\, t'_{p'}$: it follows that $h$ is vulnerable as well to a collision-resistance attack, since we are able to find $m'_{l'} \,\|\, t'_{p'}$ hashing into the same value as $m_l \,\|\, t_p$ for $h$;
• $m_l \,\|\, t_p = m'_{l'} \,\|\, t'_{p'}$.
When $m_l \,\|\, t_p = m'_{l'} \,\|\, t'_{p'}$ we obtain that:
• $m_l = m'_{l'}$;
• $t_p = t'_{p'}$.
So, $t_p = t'_{p'}$ leads us to: $h_K(z_p \,\|\, t_{p-1}) = h_K(z'_{p'} \,\|\, t'_{p'-1})$. As before, without loss of generality let us suppose that $p \ge p'$. So, for the above equation we have two cases:
• $z_p \,\|\, t_{p-1} \ne z'_{p'} \,\|\, t'_{p'-1}$: it follows that $h$ is vulnerable as well to a collision-resistance attack, since we are able to find $z'_{p'} \,\|\, t'_{p'-1}$ hashing into the same value as $z_p \,\|\, t_{p-1}$ for $h$;
• $z_p \,\|\, t_{p-1} = z'_{p'} \,\|\, t'_{p'-1}$.
The second case gives $z_p = z'_{p'}$ and $t_{p-1} = t'_{p'-1}$, so we can repeat the process. Further, we obtain that:
• $z_{p-1} = z'_{p'-1}$;
• $t_{p-2} = t'_{p'-2}$.
Continuing like this we either have:
1) two different inputs hash into the same value (for the compression function, making it vulnerable as well);
2) under the assumption that $p = p'$, we have $t_j = t'_j$ and $z_j = z'_j$ for $j = 1, \ldots, p$; it follows that token-free-proc($m_l \,\|\, y_{l-1}$) = token-free-proc($m'_{l'} \,\|\, y'_{l'-1}$), and hence that we must have $y_{l-1} = y'_{l'-1}$ (since token-free-proc uses the token-free code $C$);
3) under the assumption that $p > p'$, there is an $x > 1$ such that $t_u = t'_j$ and $z_u = z'_j$ for $u = x, \ldots, p$ and $j = 1, \ldots, p'$; it follows that token-free-proc($m'_{l'} \,\|\, y'_{l'-1}$) is a suffix of token-free-proc($m_l \,\|\, y_{l-1}$), a contradiction with Lemma 3.
Thus, since all the other cases show that the compression function is vulnerable as well, we need to repeat the process for the previous iteration step: $y_{l-1} = y'_{l'-1}$. After the repetition, similarly, we obtain that:
• $m_{l-1} = m'_{l'-1}$;
• $y_{l-2} = y'_{l'-2}$.
Otherwise, the compression function $h$ is vulnerable as well. Clearly, this repetitive process can continue until:
1) two different chunks hash into the same value (the compression function being vulnerable as well);
2) under the assumption that $l = l'$, we have $y_i = y'_i$ and $m_i = m'_i$ for $i = 1, \ldots, l$; it follows that token-free-proc($M$) = token-free-proc($M'$), a contradiction with Lemma 2;
3) under the assumption that $l > l'$, there is a $j > 1$ such that $y_k = y'_i$ and $m_k = m'_i$ for $i = 1, \ldots, l'$ and $k = j, \ldots, l$; it follows that token-free-proc($M'$) is a suffix of token-free-proc($M$), a contradiction with Lemma 3.
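The backward walk used in this proof is easiest to see on a plain length-strengthened chain. The sketch below is a simplification, not the token-free construction itself: it uses an SMD-style chain over bytes, a stand-in compression function with a deliberately tiny 1-byte output so that collisions of the iterated hash are easy to find, and hypothetical helper names. Given two colliding messages, it walks both chains from the last step backwards until the compression inputs differ.

```python
import hashlib

N_BYTES = 1   # deliberately tiny chaining value (assumed) so collisions are cheap
B_BYTES = 4   # block size in bytes (assumed)

def h(key: bytes, block: bytes) -> bytes:
    """Toy compression function; a stand-in only."""
    return hashlib.sha256(key + block).digest()[:N_BYTES]

def pad(message: bytes) -> bytes:
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % B_BYTES)
    return padded + (8 * len(message)).to_bytes(B_BYTES, "big")

def chain(key: bytes, message: bytes) -> list:
    """(compression input, compression output) for every step of the iteration."""
    steps, y = [], bytes(N_BYTES)
    padded = pad(message)
    for i in range(0, len(padded), B_BYTES):
        x = padded[i:i + B_BYTES] + y
        y = h(key, x)
        steps.append((x, y))
    return steps

def find_iterated_collision(key: bytes):
    """Hash 2-byte counter messages until two digests of the iterated hash agree."""
    seen, counter = {}, 0
    while True:
        message = counter.to_bytes(2, "big")
        digest = chain(key, message)[-1][1]
        if digest in seen:
            return seen[digest], message
        seen[digest] = message
        counter += 1

def trace_back(key: bytes, m1: bytes, m2: bytes):
    """Walk both chains backwards to the first step whose inputs differ."""
    for (x1, y1), (x2, y2) in zip(reversed(chain(key, m1)), reversed(chain(key, m2))):
        assert y1 == y2          # outputs agree at every step reached
        if x1 != x2:
            return x1, x2        # collision for the compression function
    return None
```

In this simplified setting the injective padding plays the role that Lemmas 2 and 3 play for token-free-proc: it rules out the case in which the walk ends without finding two distinct inputs.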
Theorem 4. Let $H$ be the compression hash function family with which the token-free bounded delay iteration technique is instantiated. Let $H^{*}$ be the hash function family obtained through the iteration technique. Then:
$$\mathbf{Adv}^{\mathrm{Coll}}_{H^{*}}(R) \le \mathbf{Adv}^{\mathrm{Coll}}_{H}(R)$$
Proof: We will use an adversary for the iterated hash function relative to the collision-resistance property (call it $A$) to construct the following adversary $A'$ for the compression function:
1) generate $k = \mathrm{rand}(\mathcal{K})$;
2) obtain $(M, M') \leftarrow A(k)$ (use adversary $A$ on the iterated hash function);
3) compute $m_{1j} \,\|\, t_{1k}$ and $m_{2n} \,\|\, t_{2o}$ where $h_k(m_{1j} \,\|\, t_{1k}) = h_k(m_{2n} \,\|\, t_{2o})$, with $j$ in $1..l$ and $n$ in $1..q$, $l$ and $q$ being the number of blocks into which $M$, respectively $M'$, is split;
4) return $m_{1j} \,\|\, t_{1k}$ and $m_{2n} \,\|\, t_{2o}$.
Note that $m_{1j} \,\|\, t_{1k}$ and $m_{2n} \,\|\, t_{2o}$ are intermediate values obtained when digesting the message $M$, respectively $M'$. Also note that, when constructing this algorithm, we have used the inference proven in Lemma 4: any successful collision-resistance attack on the iterated hash function can be traced back to the compression function on which the iteration relies. Therefore, because $A'$ runs only $A$ and no other computation is relevant for its complexity, we have:
$$\mathbf{Adv}^{\mathrm{Coll}}_{H}(A') = \mathbf{Adv}^{\mathrm{Coll}}_{H^{*}}(A)$$
The fact that the adversary $A'$ is a valid algorithm for $\mathbf{Adv}^{\mathrm{Coll}}_{H}$ is based on the following statement: when $H$ is a compression function, $\mathbf{Adv}^{\mathrm{Coll}}_{H}(A')$ is the same for every message length $m \ge m'$, with $m'$ being the input size accepted by $H$. Note two things:
• our statement is based on the fact that, in the definition of $\mathbf{Adv}^{\mathrm{Coll}}_{H}$, we truncate the $m$-sized generated input to $m'$ bits;
• when $m < m'$ we cannot apply $\mathbf{Adv}^{\mathrm{Coll}}_{H}$, since it assumes that a hash value is computed from a random input of size at least $m'$.
Because adversary $A$ is the best algorithm to find a collision for the iterated hash function and $\mathbf{Adv}^{\mathrm{Coll}}_{H}(A') \le \mathbf{Adv}^{\mathrm{Coll}}_{H}(R)$, we then have:
$$\mathbf{Adv}^{\mathrm{Coll}}_{H^{*}}(R) \le \mathbf{Adv}^{\mathrm{Coll}}_{H}(R)$$
D. Comparison with other existing iteration techniques
As shown in Table I, and according to [2], the SMD construction preserves Coll and ePre security, but fails to preserve any of the other notions. All of the existing schemes preserve ePre; intuitively, if all of the range of the (randomly keyed) compression function is hard to invert, then iterating produces a function whose range is similarly hard to invert. Apart from ePre, most schemes preserve only collision resistance. These schemes include SMD, EMD, HAIFA, and Randomized hash. Of the twelve schemes in the table, besides ROX, none preserves all seven notions. In fact, the best-performing earlier constructions in terms of property preservation are the XOR-Linear hash and Shoup's hash, which still preserve only three of the seven notions (Coll, eSec, and ePre). Among the earlier schemes, the XOR Tree hash is the only iteration to preserve Pre, and none of them preserves Sec, aSec or aPre. Remember that the latter two are particularly relevant for the security of practical hash functions because they do not rely on the compression function being chosen at random from a family.
In relation to the existing iteration techniques, our token-free scheme is proven to preserve Coll, Pre, ePre and aPre, and has neither proofs nor counterexamples regarding Sec, eSec or aSec. That makes it the best performer after ROX. However, the difference between ROX and the proposed token-free technique is that ROX is more of a theoretically proven seven-property-preserving technique that is hard to implement in practice. The reason is that ROX makes use of two random oracles, which are a challenge for an actual implementation. Indeed, its designers say [2]: "It is quite standard in cryptography for new primitives to first find instantiations in the random oracle model, only much later to be replaced with constructions in the standard model." So, without entering into implementation details, the ROX designers only sketch an implementation that either reuses the compression function with about three times as many rounds as normal (with different values of constants), admitting that this violates good cryptographic hygiene, or uses calls to a block cipher like AES, which is designed independently of the compression function. In this respect, our technique is much easier to implement, since finding token-free codes with bounded delay seems to be far easier than implementing a specific random oracle [1].
Regarding generic attacks (such as multi-collisions, second-preimage search with expandable messages [5] or herding attacks [6]), to which the original MD construction is known not to be resistant, neither the token-free nor the ROX technique has been proven secure. Moreover, we are aware that our current technique is clearly susceptible to these types of attacks, as we expose our whole inner state (through the returned hash digest). However, the Sponge approach seems to have proven security against length-extension attacks; therefore, proposing an iteration technique that is resilient to those attack types as well is part of our future work. In [7], the Sponge authors propose to work in a squeeze mode (truncating the final result), but for now this is not a solution for us, as our proof of the preservation of Coll would no longer be valid (we cannot guarantee that a collision of the iterated hash function can be traced back to a collision of the compression function). Still, regarding a possible squeeze mode for MD-like constructions, the Sponge designers advise that MD functions should work with a larger inner state and, at the end, just truncate the final chaining value to the desired hash length. The Sponge authors also say that this should be done even if the reduction proof for collision resistance is no longer valid, because there is no evidence that designing a fixed-length compression function would be easier in the first place. That being said, the resistance of the resulting function would be limited by the size of the inner state (an approach successfully applied in the new SHA-3: Keccak).
In relation to SMD, which preserves only Coll and ePre, our technique brings a security improvement by preserving two further security properties, Pre and aPre (but with a performance trade-off). Also, we learned that using token-free coding only in the preprocessing phase of the plain SMD technique does not enhance its security, since the CE1 from [2] applies to the resulting iteration technique as well.
In conclusion, even though we do not have proven security for Sec, eSec, and aSec, our technique seems to be more practical than ROX. On the other hand, since our token-free approach follows the MD construction model, it is susceptible to multi-collision attacks. Making it resistant to those attacks, by investigating how to benefit from the Sponge approach and token-free codes at the same time, as well as trying to model the ROX random oracles or the padding technique of the Sponge functions with token-free codes, are different investigation tracks for our future work.
IV. CONCLUSIONS
We contributed to the field first by introducing the notion of token-free bounded delay codes, after which we showed that the iteration technique based on token-free bounded delay codes preserves the hash security of a compression hash function. We are able to preserve all of the following security properties: Pre, aPre, ePre, Coll. Moreover, regarding the preservation of the seven notions of hash security, the token-free iteration technique seems to be the best performer after ROX. Furthermore, even though we do not have proven security for Sec, aSec, and eSec, our technique seems to be more practical than ROX. The main reason is that ROX works in the random oracle model, as opposed to the standard model in which our technique works.
REFERENCES
[1] S. C. Dițu. Can we use codes with bounded delay on hash iteration techniques? Bachelor Thesis, 2011, coordinated by prof. dr. F. L. Țiplea.
[2] P. Rogaway, T. Shrimpton. Cryptographic Hash-Function Basics: Definitions, Implications, and Separations for Preimage Resistance, Second-Preimage Resistance, and Collision Resistance. Fast Software Encryption 2004, LNCS No. 3017, pages 371-388.
[3] E. Andreeva, G. Neven, B. Preneel, T. Shrimpton. Seven-Property-Preserving Iterated Hashing: ROX. Asiacrypt 2007.
[4] A. Salomaa. Jewels of formal language theory. Computer Science Press, 1981. University of Turku, Finland.
[5] J. Kelsey, B. Schneier. Second preimages on n-bit hash functions for much less than $2^n$ work. Eurocrypt 2005, LNCS No. 3494, pages 474-490.
[6] J. Kelsey, T. Kohno. Herding hash functions and the Nostradamus attack. Eurocrypt 2006, LNCS No. 4004, pages 183-200.
[7] G. Bertoni, J. Daemen, M. Peeters, G. V. Assche. Sponge Functions. Ecrypt Hash Workshop 2007.
[8] G. Bertoni, J. Daemen, M. Peeters, G. V. Assche. The KECCAK reference. NIST SHA-3 Competition, 2007-2012.
