
CS702 SEMINAR

Cryptographic Hash Functions


G.Sanjay CS93128
Instructor : Prof. C.Pandurangan

1 Introduction
Cryptographic hash functions are of vital significance in the context of
signature schemes. This section gives a brief overview of signature schemes
and discusses why hash functions are needed.

1.1 Overview of Signature Schemes


Conventionally, a hand-written signature attached to a document is used to
specify the person responsible for the document. A signature is used in every-
day situations such as writing a letter, withdrawing money from a bank and
signing a contract. A signature scheme is a method of signing a message
stored in electronic form. A fundamental aspect of signature schemes is
that a copy of a signed digital message is identical to the original, so
care must be taken to prevent a signed digital message from being reused.
A signature scheme consists of:
• A secret signing algorithm sig.
• A public verification algorithm ver.

[Figure: the sender computes y = sig(x) from the message x and transmits
<x, y>; the receiver accepts the message if ver(x, y) = true.]

In the above figure, the sender prepares the message $x$ and generates the
signature $y$ using the secret signing algorithm. He then transmits
$\langle x, y \rangle$ across the network. The receiver uses the public
verification algorithm to check that $y$ is indeed a valid signature on $x$
by that sender.
An important aspect of signature schemes is the prevention of forgery.
Several signature schemes have been proposed, including the ElGamal
scheme, the Digital Signature Standard (DSS) and the Lamport Signature
Scheme.

1.2 Need for Hash Functions


An important problem with signature schemes is that they can sign only
short messages. For example, when using the DSS, a 160-bit message is
signed with a 320-bit signature. In practice, most messages are much
longer. The only direct way to sign them is to split them into smaller
chunks and sign each chunk individually. However, this is unsatisfactory
because:
• The resulting signature is longer than the message itself (twice as long
  in the DSS example).
• Each chunk requires complicated arithmetic operations such as modular
  exponentiation, which makes signing extremely slow.
• Most importantly, the integrity of the message as a whole is not
  protected. Each chunk is signed on its own, so no single signature is
  associated with the whole message, and signed chunks could be dropped or
  rearranged without detection.
The solution is to use a hash function, which takes a message of arbitrary
length and produces a digest of a specified size. The digest is then signed.

[Figure: message x --hash--> digest z = h(x) --sign--> signature y = sig(z)]

As shown in the above figure, the message $x$ is hashed using the public
hash function $h$ to get the digest $z$. The digest $z$ is then signed using
the secret algorithm sig to get the signature $y$. Next,
$\langle x, y \rangle$ is transmitted across the channel. To verify, the
receiver reconstructs $z = h(x)$ (recall that $h$ is public) and checks
whether $\mathrm{ver}(z, y) = \mathrm{true}$.
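The hash-then-sign flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: textbook RSA with tiny fixed parameters stands in for the signature scheme (the report itself names ElGamal, the DSS and Lamport signatures), and SHA-256 reduced modulo the toy modulus stands in for $h$. The helper names `sign_message` and `verify_message` are hypothetical, and the parameters are far too small to offer any security.

```python
import hashlib

# Toy RSA parameters (assumption: chosen for readability, not security).
P, Q = 61, 53
N = P * Q            # public modulus
E = 17               # public verification exponent
D = 2753             # secret signing exponent, E * D = 1 mod (P - 1)(Q - 1)


def h(message: bytes) -> int:
    """Public hash: SHA-256 reduced mod N so the digest fits the toy scheme."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sig(z: int) -> int:
    """Secret signing algorithm, applied to a digest z."""
    return pow(z, D, N)


def ver(z: int, y: int) -> bool:
    """Public verification algorithm: is y a valid signature on the digest z?"""
    return pow(y, E, N) == z


def sign_message(x: bytes) -> int:
    """Sender side: hash the message, then sign the digest."""
    return sig(h(x))


def verify_message(x: bytes, y: int) -> bool:
    """Receiver side: recompute z = h(x) and check ver(z, y)."""
    return ver(h(x), y)


if __name__ == "__main__":
    x = b"Pay Alice 100 rupees"
    y = sign_message(x)
    print(verify_message(x, y))             # the receiver accepts <x, y>
    print(verify_message(x, (y + 1) % N))   # an altered signature is rejected
```

However long the message is, only one modular exponentiation is performed and the signature stays the size of one signed digest, which is the point of hashing before signing.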

2 Desired Properties of Hash Functions
Signature schemes must prevent the forgery of signatures, and the use of
hash functions must not compromise this requirement. In the discussion that
follows, for concreteness, let us assume that Bob is sending a message to
Alice, and that Oscar is a forger who tries to forge Bob's signature. Each
kind of forgery that Oscar can attempt is discussed along with the property
that hash functions must possess to prevent it.
Forgery 1
Assume that Bob transmits $\langle x, y \rangle$, where $z = h(x)$ and
$y = \mathrm{sig}(z)$.
Oscar finds $x' \neq x$ such that $h(x') = h(x)$.
Then Oscar transmits $\langle x', y \rangle$, which is a forgery.
Property 1
A hash function $h$ is weakly collision free if, given a message $x$, it is
computationally infeasible to find $x' \neq x$ such that $h(x') = h(x)$.
Forgery 2
Oscar finds any two messages $x \neq x'$ with $h(x) = h(x')$. Oscar gives
Bob $x$ and asks him to sign it, obtaining $y$. Then $\langle x', y \rangle$
is a forgery by Oscar.
Property 2
A hash function $h$ is strongly collision free if it is computationally
infeasible to find $x, x'$ such that $x \neq x'$ and $h(x) = h(x')$.
Forgery 3
In some signature schemes, it is possible to compute a valid signature $y$
on a random digest $z$. Examples are schemes where $\mathrm{ver}(z, y)$
computes a value $z'$ from $y$ and checks whether $z' = z$. If Oscar can
then find $x$ such that $z = h(x)$, the pair $\langle x, y \rangle$ is a
valid forgery.
Property 3
A hash function $h$ is one-way if, given a message digest $z$, it is
computationally infeasible to find $x$ such that $h(x) = z$.

2.1 Relationship between the various properties
In the previous section, we saw that it is important for hash functions to
be strongly collision free, weakly collision free and one-way. In this
section, we establish an interesting relationship between these three
properties: if a hash function is strongly collision free, then it is also
weakly collision free and one-way. That a strongly collision free hash
function is also weakly collision free is obvious. We now discuss why a
strongly collision free hash function is one-way.

Theorem 1 Suppose $h : X \to Z$ is a hash function where $|X|$ and $|Z|$ are
finite and $|X| \geq 2|Z|$. Suppose $A$ is an inversion algorithm for $h$.
Then there exists a probabilistic Las Vegas algorithm which finds a
collision for $h$ with probability at least $1/2$.
Proof:
Consider the Algorithm B presented below.
Algorithm B

1. Choose a random $x \in X$.
2. Compute $z = h(x)$.
3. Compute $x_1 = A(z)$.
4. If $x_1 \neq x$ then
      $x_1$ and $x$ collide under $h$
   else Quit.
Clearly the above algorithm is a probabilistic algorithm of the Las Vegas
type, since it either finds a collision or returns no answer. It remains to
compute the probability of success.
For any $x, x_1 \in X$, define $x R x_1$ if $h(x) = h(x_1)$.
Clearly, $R$ is an equivalence relation.
Define $[x] = \{x_1 \in X : x R x_1\}$.
Evidently, the number of equivalence classes is at most $|Z|$.
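Let the set of all equivalence classes be $C$.
The algorithm $A$ sees only $z = h(x)$, and each of the $|[x]|$ elements of
$[x]$ is equally likely to have been the chosen $x$. Hence $A$ returns $x$
itself with probability $1/|[x]|$, and the algorithm succeeds with
probability $(|[x]| - 1)/|[x]|$. Averaging over the choice of $x$,

\[
\begin{aligned}
p(\text{success}) &= \frac{1}{|X|} \sum_{x \in X} \frac{|[x]| - 1}{|[x]|} \\
                  &= \frac{1}{|X|} \sum_{c \in C} \sum_{x \in c} \frac{|c| - 1}{|c|} \\
                  &= \frac{1}{|X|} \sum_{c \in C} (|c| - 1) \\
                  &= \frac{|X| - |C|}{|X|} \\
                  &\geq \frac{|X| - |Z|}{|X|} \\
                  &\geq \frac{1}{2}.
\end{aligned}
\]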
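Algorithm B and its success probability can be checked on a small example. The sketch below makes several assumptions for illustration: the message space is the integers 0 to 31, the toy hash `h` maps into 11 digests (so $|X| \geq 2|Z|$), and the inversion algorithm `A` is a brute-force search, which is feasible only because the domain is tiny. None of these choices come from the report.

```python
import random
from fractions import Fraction

X = range(32)        # finite message space, |X| = 32
Z_SIZE = 11          # digests lie in {0, ..., 10}, so |X| >= 2|Z|


def h(x: int) -> int:
    """Toy hash function (hypothetical, not meant to be secure)."""
    return (x * x + 3 * x) % Z_SIZE


def A(z: int) -> int:
    """Toy inversion algorithm: brute-force search for a preimage of z."""
    for x in X:
        if h(x) == z:
            return x
    raise ValueError("z has no preimage in X")


def algorithm_b(rng: random.Random):
    """One run of Algorithm B: return a collision (x, x1), or None for Quit."""
    x = rng.choice(X)
    z = h(x)
    x1 = A(z)
    if x1 != x:
        return x, x1
    return None


def exact_success_probability() -> Fraction:
    """Fraction of starting points x for which Algorithm B finds a collision."""
    wins = sum(1 for x in X if A(h(x)) != x)
    return Fraction(wins, len(X))


if __name__ == "__main__":
    rng = random.Random(0)
    print([algorithm_b(rng) for _ in range(5)])
    print(exact_success_probability())
```

The exact probability equals (number of messages minus number of equivalence classes) divided by the number of messages, in agreement with the derivation above, and is therefore at least 1/2.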

2.2 Choice of Digest Size


An important aspect of hash functions is the choice of the size of the
message digest. This choice is dictated by what is commonly known as the
Birthday Attack, named after the birthday paradox: in a group of 23 people,
the probability that at least two of them share a birthday exceeds $1/2$.
It can easily be seen by straightforward probability arguments that hashing
just over $\sqrt{n}$ (about $1.17\sqrt{n}$) random elements of $X$, where
the digest can take $n$ possible values, yields a collision with probability
greater than $1/2$. Thus, if the message digest is 40 bits long, we are
quite likely to find a collision with just over $2^{20}$ random hashes.
Keeping this in view, it is recommended that the message digest be at least
128 bits long.
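The birthday bound is easy to check numerically. The following sketch computes the exact collision probability for $k$ random values among $n$ possibilities and then runs a small birthday attack. It assumes SHA-256 truncated to a few bits as a stand-in for a short digest, and it hashes the counter strings "0", "1", "2", ... rather than truly random messages; both are illustrative choices, not part of the report.

```python
import hashlib
import math


def collision_probability(k: int, n: int) -> float:
    """Probability that k uniform random values out of n are not all distinct."""
    p_distinct = 1.0
    for i in range(k):
        p_distinct *= (n - i) / n
    return 1.0 - p_distinct


def trials_for_half(n: int) -> int:
    """Smallest k whose collision probability exceeds 1/2."""
    k = 0
    while collision_probability(k, n) <= 0.5:
        k += 1
    return k


def truncated_hash(message: bytes, bits: int) -> int:
    """Top `bits` bits of SHA-256, standing in for a short message digest."""
    value = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return value >> (256 - bits)


def find_collision(bits: int):
    """Hash b"0", b"1", b"2", ... until two truncated digests coincide."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        digest = truncated_hash(message, bits)
        if digest in seen:
            return seen[digest], message, counter + 1
        seen[digest] = message
        counter += 1


if __name__ == "__main__":
    print(trials_for_half(365))            # group size in the birthday paradox
    print(1.17 * math.sqrt(365))           # the 1.17 * sqrt(n) estimate
    first, second, hashes = find_collision(20)
    print(first, second, hashes)           # compare hashes with sqrt(2**20) = 1024
```

For a 20-bit digest the search typically stops after a number of hashes on the order of $2^{10}$; the same scaling is what makes a 40-bit digest vulnerable after roughly $2^{20}$ hashes.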

3 Case Studies of Hash Functions
In this section, we will study some of the landmark hash functions that have
been proposed and are being used in practice.

3.1 A Discrete Logarithm Hash Function


In this section, we describe a hash function, due to Chaum, van Heijst and
Pfitzmann, that is secure as long as a particular discrete logarithm cannot
be computed. This hash function is not fast enough to be of practical use,
but it is conceptually simple and is an example of a function that can be
proved secure under a reasonable computational assumption.
The hash function is presented below.
Suppose $p$ is a large prime and $q = (p - 1)/2$ is also prime. Let $\alpha$
and $\beta$ be primitive elements of $\mathbb{Z}_p$. The value
$\log_\alpha \beta$ is not public, and we assume that it is computationally
infeasible to compute. The hash function
$h : \{0, \ldots, q - 1\} \times \{0, \ldots, q - 1\} \to \mathbb{Z}_p \setminus \{0\}$
is defined as
$h(x_1, x_2) = \alpha^{x_1} \beta^{x_2} \bmod p$.
It is possible to show that this hash function is strongly collision free
provided that the discrete logarithm $\log_\alpha \beta$ is difficult to
compute.

Theorem 2 Given one collision for the CHP hash function $h$, the discrete
logarithm $\log_\alpha \beta$ can be computed efficiently.
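Theorem 2 can be illustrated on a toy instance. The sketch below assumes the small primes $p = 1019$ and $q = 509$, takes the two smallest primitive elements as $\alpha$ and $\beta$, and finds a collision by brute force, which is feasible only because $p$ is tiny. The recovery step uses the standard argument: a collision $h(x_1, x_2) = h(x_3, x_4)$ gives $\alpha^{x_1 - x_3} = \beta^{x_4 - x_2}$, so $\log_\alpha \beta$ satisfies a linear congruence modulo $p - 1 = 2q$. All names and parameters are illustrative, and `pow(..., -1, Q)` requires Python 3.8 or later.

```python
P = 1019                 # toy prime such that Q = (P - 1) / 2 is also prime
Q = (P - 1) // 2         # 509


def is_primitive(g: int) -> bool:
    """g generates Z_P^* iff its order divides neither 2 nor Q (P - 1 = 2Q)."""
    return pow(g, 2, P) != 1 and pow(g, Q, P) != 1


# Two distinct primitive elements; log_ALPHA(BETA) plays the role of the secret.
ALPHA, BETA = [g for g in range(2, P) if is_primitive(g)][:2]


def h(x1: int, x2: int) -> int:
    """CHP hash: alpha^x1 * beta^x2 mod p, for 0 <= x1, x2 <= Q - 1."""
    return (pow(ALPHA, x1, P) * pow(BETA, x2, P)) % P


def find_collision():
    """Brute-force a collision; possible only because P is tiny."""
    seen = {}
    for x1 in range(Q):
        for x2 in range(Q):
            digest = h(x1, x2)
            if digest in seen:
                return seen[digest], (x1, x2)
            seen[digest] = (x1, x2)
    raise AssertionError("unreachable: Q * Q inputs map to only P - 1 digests")


def log_from_collision(first, second) -> int:
    """Recover log_ALPHA(BETA) from a collision h(x1, x2) == h(x3, x4)."""
    (x1, x2), (x3, x4) = first, second
    # alpha^(x1 - x3) = beta^(x4 - x2), so a * (x4 - x2) = x1 - x3 mod (P - 1).
    # Solve mod Q, where x4 - x2 is invertible, then try both lifts mod 2Q.
    a0 = ((x1 - x3) * pow((x4 - x2) % Q, -1, Q)) % Q
    for a in (a0, a0 + Q):
        if pow(ALPHA, a, P) == BETA:
            return a
    raise ValueError("the two inputs do not collide under h")


if __name__ == "__main__":
    first, second = find_collision()
    a = log_from_collision(first, second)
    print(first, second, a, pow(ALPHA, a, P) == BETA)
```

Read in the other direction, this is the security argument: anyone who could find collisions efficiently could also compute $\log_\alpha \beta$ efficiently.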

3.2 Extending a Hash Function to an Infinite Domain


So far, only hash functions with a finite domain have been considered. We
now study how a strongly collision free hash function with a finite domain
can be extended to a strongly collision free hash function with an infinite
domain. This will enable the signing of messages of arbitrary length.

Suppose $h : (\mathbb{Z}_2)^m \to (\mathbb{Z}_2)^t$ is a strongly collision
free hash function, where $m \geq t + 1$. It is now possible to construct a
strongly collision free hash function
$h^* : \bigcup_{i=m}^{\infty} (\mathbb{Z}_2)^i \to (\mathbb{Z}_2)^t$.

First, the case where $m > t + 1$ is considered. $h^*$ may be constructed as
described below.

1. Express $x$ as $x_1 \| x_2 \| \cdots \| x_k$, where
   $|x_1| = |x_2| = \cdots = |x_{k-1}| = m - t - 1$ and
   $|x_k| = m - t - 1 - d$, with $0 \leq d \leq m - t - 2$.
2. For $i = 1$ to $k - 1$ do $y_i = x_i$.
3. $y_k = x_k \| 0^d$.
4. $y_{k+1}$ = binary representation of $d$ (padded on the left with 0's so
   that $|y_{k+1}| = m - t - 1$).
5. $g_1 = h(0^{t+1} \| y_1)$.
6. For $i = 1$ to $k$ do $g_{i+1} = h(g_i \| 1 \| y_{i+1})$.
7. $h^*(x) = g_{k+1}$.
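The seven steps translate almost directly into code. In the sketch below, bit strings are Python strings of "0" and "1", the parameters $t = 16$ and $m = 32$ are arbitrary, and `compress` is a hypothetical stand-in for the fixed-size function $h$, built from truncated SHA-256. It is not claimed to be strongly collision free at this size. For convenience the sketch accepts any non-empty bit string, whereas the text defines $h^*$ on strings of length at least $m$.

```python
import hashlib

T = 16               # digest length t in bits
M = 32               # input length m of the compression function, m > t + 1
BLOCK = M - T - 1    # message bits absorbed per call, m - t - 1


def compress(bits: str) -> str:
    """Stand-in for h: (Z_2)^m -> (Z_2)^t, built from truncated SHA-256."""
    assert len(bits) == M
    value = int.from_bytes(hashlib.sha256(bits.encode("ascii")).digest(), "big")
    return format(value >> (256 - T), f"0{T}b")


def split_blocks(x: str) -> list:
    """Steps 1-4: cut x into y_1..y_k, zero-pad y_k, append y_{k+1} encoding d."""
    k = -(-len(x) // BLOCK)                  # ceil(len(x) / BLOCK)
    d = k * BLOCK - len(x)                   # number of padding zeroes
    padded = x + "0" * d
    y = [padded[i * BLOCK:(i + 1) * BLOCK] for i in range(k)]
    y.append(format(d, f"0{BLOCK}b"))
    return y


def extended_hash(x: str) -> str:
    """Steps 5-7: chain the compression function over the blocks."""
    if not x or set(x) - {"0", "1"}:
        raise ValueError("x must be a non-empty bit string")
    y = split_blocks(x)
    g = compress("0" * (T + 1) + y[0])       # g_1
    for block in y[1:]:
        g = compress(g + "1" + block)        # g_{i+1}
    return g                                 # g_{k+1}


if __name__ == "__main__":
    print(split_blocks("101"))
    print(extended_hash("1" * 40))
```

The final block $y_{k+1}$ records how much padding was added, so two messages that differ only in trailing zeroes are still split into different block sequences.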
For the case where $m = t + 1$, a different construction is required, as
shown below. Here $x = x_1 x_2 \ldots x_n$ is written bit by bit.

1. Let $y = y_1 y_2 \ldots y_k = 11 \| f(x_1) \| f(x_2) \| \cdots \| f(x_n)$,
   where $f(0) = 0$ and $f(1) = 01$.
2. $g_1 = h(0^t \| y_1)$.
3. For $i = 1$ to $k - 1$ do $g_{i+1} = h(g_i \| y_{i+1})$.
4. $h^*(x) = g_k$.
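The second construction absorbs one bit of the encoded message per call. A minimal sketch follows, again with bit strings as Python strings and with a hypothetical `compress` built from truncated SHA-256 standing in for $h$; the value $t = 16$ is arbitrary.

```python
import hashlib

T = 16               # digest length t in bits
M = T + 1            # input length of the compression function, m = t + 1


def compress(bits: str) -> str:
    """Stand-in for h: (Z_2)^(t+1) -> (Z_2)^t, built from truncated SHA-256."""
    assert len(bits) == M
    value = int.from_bytes(hashlib.sha256(bits.encode("ascii")).digest(), "big")
    return format(value >> (256 - T), f"0{T}b")


def encode(x: str) -> str:
    """Step 1: y = 11 || f(x_1) || ... || f(x_n), with f(0) = 0 and f(1) = 01."""
    return "11" + "".join("0" if bit == "0" else "01" for bit in x)


def extended_hash(x: str) -> str:
    """Steps 2-4: feed the encoded bits into the chain one at a time."""
    if not x or set(x) - {"0", "1"}:
        raise ValueError("x must be a non-empty bit string")
    y = encode(x)
    g = compress("0" * T + y[0])             # g_1
    for bit in y[1:]:
        g = compress(g + bit)                # g_{i+1}
    return g                                 # g_k


if __name__ == "__main__":
    print(encode("101"))
    print(extended_hash("101"))
```

In the encoded string the pattern 11 occurs only at the very start, which marks where an encoding begins.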

3.3 Practical Hash Functions
The hash functions discussed so far are too slow to be useful in practice.
In this section, we discuss the MD4, MD5 and SHS schemes. The MD4 hash
function was proposed in 1990 by Rivest.
The MD4 function works as follows.
Given a bitstring x, an array M whose length is divisible by 512 bits is
constructed. This construction includes padding with zeroes, though the
exact details are more complicated. M is then broken into 32-bit words; if
N is the total number of words, N is divisible by 16. From M, a 128-bit
digest is constructed. The words of M are processed in groups of sixteen.
Each group goes through three rounds of hashing, and in each round one
operation is performed on each of the sixteen words of the group. The steps
may be summarized as follows. Here A, B, C and D are registers, and the
final digest is obtained by concatenating their contents.

1. Initialize the registers A, B, C, D.
2. for i = 0 to N/16 - 1 do
3.    for j = 0 to 15 do X[j] = M[16i + j]
4.    AA = A; BB = B; CC = C; DD = D.
5.    Hash Rounds 1, 2, 3
6.    A = A + AA; B = B + BB; C = C + CC; D = D + DD.
Each hashing round, as mentioned earlier, involves 16 steps. A full
enumeration of the steps is hardly instructive, so only the first few steps
of Hashing Round 1 (HR1) are given below.
First few steps of HR1
A = (A + f(B, C, D) + X[0]) << 3
D = (D + f(A, B, C) + X[1]) << 7
C = (C + f(D, A, B) + X[2]) << 11
B = (B + f(C, D, A) + X[3]) << 19
...
where the function f is given by
f(x, y, z) = (x and y) or ((not x) and z).
It is to be noted that the shifts listed above are cyclic (the bits shifted
out on the left re-enter on the right) and that the additions are performed
modulo $2^{32}$. Importantly, all the operations performed are boolean
logical operations or additions, which are extremely simple and fast. MD4
is consequently extremely fast, and software implementations have attained
speeds of 1.4 MBytes hashed per second.
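The four listed steps of Hashing Round 1 can be written out as follows. This is a sketch of those four steps only, not a complete MD4 implementation; the helper names `rotl` and `first_four_steps` are introduced here for illustration. The initial register values in the usage example are those given in RFC 1320, which the report does not quote.

```python
MASK = 0xFFFFFFFF    # registers hold 32-bit words


def rotl(value: int, shift: int) -> int:
    """Cyclic left shift of a 32-bit word (the "<<" of the steps above)."""
    value &= MASK
    return ((value << shift) | (value >> (32 - shift))) & MASK


def f(x: int, y: int, z: int) -> int:
    """Bitwise selection: (x and y) or ((not x) and z)."""
    return ((x & y) | (~x & z)) & MASK


def first_four_steps(A, B, C, D, X):
    """The first four steps of Hashing Round 1, in the order listed above."""
    A = rotl(A + f(B, C, D) + X[0], 3)
    D = rotl(D + f(A, B, C) + X[1], 7)
    C = rotl(C + f(D, A, B) + X[2], 11)
    B = rotl(B + f(C, D, A) + X[3], 19)
    return A, B, C, D


if __name__ == "__main__":
    # Initial register values as specified in RFC 1320.
    registers = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
    words = [0] * 16                         # one 16-word group X of M
    print([hex(r) for r in first_four_steps(*registers, words)])
```

Each step costs a handful of word-level operations, which is why the function is so much faster than the number-theoretic constructions of the previous sections.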
However, it is difficult to say anything conclusive regarding the security
of the MD4 hash function, as it is not based on a well-studied problem such
as the Discrete Logarithm or Factoring problem. Confidence in the security
of the system can only be attained over time, as the system is studied and
not found to be insecure. Although the full MD4 had not been broken at the
time of writing, weakened versions that omit either the first or the third
round have been broken without much difficulty.
A strengthened version of MD4, called MD5, was proposed in 1991. MD5 uses
four rounds instead of three, and runs about 30% slower than MD4.
The Secure Hash Standard (SHS) is based on the same principles but is more
complicated and slower. It was adopted as a standard on May 11, 1993. The
SHS produces a five-register (160-bit) digest, unlike MD4, which produces a
128-bit digest.

4 Timestamping of messages
One difficulty with signature schemes is that a signing algorithm may be
compromised. If Oscar compromises Bob's signing algorithm, a problem even
more serious than the compromise itself is that the authenticity of all
messages signed by Bob, including those he signed before the compromise, is
called into question. Further, there is the possibility that Bob may disown
messages which are actually his.
These events can occur because there is no way to determine when a message
was signed. This problem is tackled by timestamping messages. A timestamp
provides proof that a message was signed at a particular time. Thus, if
Bob's signing algorithm were compromised, the signatures he made previously
would not be invalidated.
We will discuss two ways of achieving timestamping.
First, Bob can make use of some public information pub that could only have
become available on the day of signing. For example, pub could be derived
from the news reports or share prices of that day, which could not have
been known earlier. Bob combines pub with any message he signs that day,
which shows that the signature was not produced before that day. He then
publishes the result the next day in a manner accessible to everyone, which
shows that it was not produced after the day in question. In this manner,
the time at which Bob signed the message can be located to within a day.
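One way to bind a message to the day's public information is to hash the message digest together with pub and sign the result. The report does not spell out how pub is combined with the message, so the construction below is an assumption made for illustration. SHA-256 stands in for the hash function, the signing step is omitted, and the function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Public hash function (SHA-256 as a stand-in)."""
    return hashlib.sha256(data).digest()


def timestamped_digest(x: bytes, pub: bytes) -> bytes:
    """Digest binding the message x to the day's public information pub."""
    z = h(x)
    return h(z + pub)       # Bob would sign this value and publish it next day


def check_timestamp(x: bytes, pub: bytes, published: bytes) -> bool:
    """Anyone can recompute the digest and compare it with the published one."""
    return timestamped_digest(x, pub) == published


if __name__ == "__main__":
    pub = b"closing share prices of the signing day"    # hypothetical pub
    published = timestamped_digest(b"I owe Alice 100 rupees", pub)
    print(check_timestamp(b"I owe Alice 100 rupees", pub, published))
    print(check_timestamp(b"I owe Alice 100 rupees", b"another day", published))
```

Because pub enters the hash, a digest that matches could not have been computed before pub was known, and its appearance in the next day's publication bounds the signing time from the other side.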
Second, if there is a trusted timestamping service (TSS) available, Bob can
send his message to this service, which then appends the date and signs the
entire message. This is secure as long as the TSS is secure. More
complicated schemes have been suggested for the case where it is
undesirable to trust the TSS unconditionally.
