
A

SEMINAR ON

MESSAGE DIGEST
ALGORITHM

Introduction
A message digest is a compact digital fingerprint (often loosely called a signature) of an arbitrarily long stream of binary data. An ideal message digest algorithm would never generate the same digest for two different inputs. Message digest algorithms have much in common with techniques used in encryption, but to a different end.

Steps…

Why MD5 …???
Many older programs requiring digital signatures employed 16- or 32-bit cyclic redundancy checks (CRCs), originally developed to verify correct transmission in data communication protocols. These short codes are adequate to detect the kind of transmission errors for which they were intended, but they are insufficiently secure for applications such as electronic commerce and verification of security-related software distributions.
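To make the weakness of short codes concrete, here is a minimal Python sketch that searches for two different inputs with the same 32-bit CRC. It assumes Python's standard `zlib.crc32` as a stand-in for the CRCs mentioned above; the helper name `find_crc32_collision` and the counter-derived inputs are illustrative choices, not part of the original material.

```python
import hashlib
import zlib

def find_crc32_collision(limit=2_000_000):
    """Birthday-style search for two distinct inputs with the same CRC-32."""
    seen = {}  # CRC value -> the input that produced it
    for i in range(limit):
        # ASSUMPTION: 8 pseudo-random bytes derived from a counter, so the
        # inputs are unstructured but the search is repeatable.
        data = hashlib.sha256(str(i).encode()).digest()[:8]
        crc = zlib.crc32(data)
        if crc in seen and seen[crc] != data:
            return seen[crc], data, crc
        seen[crc] = data
    return None

result = find_crc32_collision()
if result is not None:
    first, second, crc = result
    print(first.hex(), second.hex(), f"share CRC-32 {crc:08x}")
```

With only 2^32 possible values, a match is expected after roughly a hundred thousand random inputs, which is why a 32-bit code cannot serve as a security check.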

History of MD5……..
The most commonly used present day message digest algorithm is the 128 bit MD5 algorithm, developed by Ronald Rivest (also one of the inventors of RSA) of the MIT Laboratory for Computer Science and RSA Data Security, Inc. The algorithm, with a reference implementation, was published as Internet RFC 1321 in April 1992, and was placed into the public domain at that time. Message digest algorithms such as MD5 are not deemed "encryption technology" and are not subject to the export controls some governments impose on other data security products.

Hashing…
A hash function or hash algorithm is a function for examining the input data and producing an output hash value. The process of computing such a value is known as hashing. The process of hashing has the property that two different inputs are unlikely to hash to the same hash value.

Hashing contd…
A fundamental property of all hash functions is that if two hashes (according to the same function) are different, then the two inputs were different in some way. This property is a consequence of hash functions being deterministic, mathematical functions, but the equality of two hash values does not guarantee the two inputs were the same unless the function is one-to-one. More typically, probability theoretic or computability theoretic properties apply to the case of equal hash values.
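The two properties above can be sketched in Python. The example assumes a deliberately tiny toy hash (`tiny_hash`, a hypothetical helper that keeps only the first byte of an MD5 digest) so that a collision can be found quickly; it is an illustration, not a real hash design.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Toy 8-bit hash: the first byte of the MD5 digest (illustration only)."""
    return hashlib.md5(data).digest()[0]

# Deterministic: the same input always gives the same hash value, so two
# different hash values can only come from two different inputs.
assert tiny_hash(b"hello") == tiny_hash(b"hello")

def first_collision():
    """With only 256 possible outputs, 257 distinct inputs must collide."""
    seen = {}
    for i in range(257):
        data = str(i).encode()
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data
        seen[h] = data
    raise AssertionError("unreachable by the pigeonhole principle")

a, b = first_collision()
print(a, b, "both hash to", tiny_hash(a))
```

Equal hash values here clearly do not mean equal inputs; a 128-bit digest makes such accidents improbable rather than impossible.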

Digital Signatures……..
• The hash value of a message, when encrypted with the private key of a person, is that person's digital signature on that e-document.
– The digital signature of a person therefore varies from document to document, thus ensuring the authenticity of each word of that document.
– As the public key of the signer is known, anybody can verify the message and the digital signature.
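The hash-then-sign idea can be sketched with textbook RSA in Python. Everything here is a toy under stated assumptions: the two Mersenne primes, the exponent 65537 and the function names are chosen only to keep the example short, and there is no padding scheme, so this is not how a production signature is built.

```python
import hashlib

# ASSUMPTION: toy key material. Two known Mersenne primes are used purely
# for brevity; real keys use large random primes.
P = 2**89 - 1
Q = 2**107 - 1
N = P * Q                          # public modulus, larger than any 128-bit digest
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """MD5 digest of the message, read as a big-endian integer."""
    return int.from_bytes(hashlib.md5(message).digest(), "big")

def sign(message: bytes) -> int:
    """'Encrypt' the digest with the private key."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Recover the digest with the public key and compare it with a fresh one."""
    return pow(signature, E, N) == digest_as_int(message)

signature = sign(b"first document")
print(verify(b"first document", signature))   # signature matches its document
print(verify(b"second document", signature))  # and does not carry over to another
```

Because the signature is computed from the digest, changing even one character of the document produces a different digest and the old signature no longer verifies.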

Why Digital Signatures ???
• To provide authenticity, integrity and non-repudiation to electronic documents.
• To use the Internet as a safe and secure medium for e-commerce and e-governance.

Advantages…
• Designed so that generating two messages with the same digest is computationally infeasible.
• Compact; does not need large lookup tables.
• Available as a command-line utility usable on either Unix or Windows.
• Useful in shell scripts or Perl programs for software installation, file comparison, and detection of file corruption and tampering.
• Integrity checking.
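The file comparison and integrity checking uses listed above can be sketched in Python with the standard `hashlib` module. The helper names `file_md5` and `same_contents`, the chunk size and the temporary file names are illustrative assumptions.

```python
import hashlib
import os
import tempfile

def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Return the MD5 digest of a file as a hex string, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_contents(path_a: str, path_b: str) -> bool:
    """Compare two files by digest instead of byte by byte."""
    return file_md5(path_a) == file_md5(path_b)

# Usage: write three temporary files and compare them by digest.
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.bin")
    copy = os.path.join(tmp, "copy.bin")
    tampered = os.path.join(tmp, "tampered.bin")
    contents = [(original, b"release-1.0"), (copy, b"release-1.0"), (tampered, b"release-1.O")]
    for path, data in contents:
        with open(path, "wb") as f:
            f.write(data)
    original_digest = file_md5(original)
    copy_matches = same_contents(original, copy)
    tampered_matches = same_contents(original, tampered)

print(original_digest, copy_matches, tampered_matches)
```

Reading in chunks keeps memory use constant, so the same function works for large software distributions.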

Disadvantages…
• Vulnerability.
– MD2 was shown to be vulnerable to a pre-image attack with time complexity equivalent to 2^104 applications of the compression function.
– It is easy to generate MD4 and MD5 collisions.

Collision Attacks…
A collision attack finds two messages with the same hash, but the attacker can't pick what the hash will be. The attacks announced at CRYPTO 2004 are collision attacks, not pre-image attacks. Collisions can be a problem for systems that involve signed code. In particular, a collision attack can enable adversaries to construct an innocuous program and a malicious program with the same hash.

Algorithm…

Explanation…
MD5 processes a variable-length message into a fixed-length output of 128 bits. The input message is broken up into 512-bit blocks; the message is padded so that its length is divisible by 512. The padding works as follows: first a single bit, 1, is appended to the end of the message. This is followed by as many zeros as are required to bring the length of the message up to 64 bits fewer than a multiple of 512. The remaining 64 bits are filled with a 64-bit integer representing the length of the original message in bits.
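The padding rule can be sketched in a few lines of Python. The sketch assumes the message is a whole number of bytes and stores the length little-endian as RFC 1321 specifies; the name `md5_pad` is illustrative.

```python
import struct

def md5_pad(message: bytes) -> bytes:
    """Pad a byte string to a multiple of 64 bytes (512 bits)."""
    bit_length = (8 * len(message)) % 2**64   # length in bits, kept to 64 bits
    padded = message + b"\x80"                # the single 1 bit, then seven 0 bits
    # Zero bytes until the length is 56 mod 64 bytes (448 mod 512 bits).
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack("<Q", bit_length)   # 64-bit length, little-endian
    return padded

for text in (b"", b"abc", b"a" * 56):
    print(len(text), "bytes ->", len(md5_pad(text)), "bytes after padding")
```

Note that padding is always added: a 56-byte message has no room left for the 1 bit and the length field in its first block, so it grows to two blocks.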

contd…
The main MD5 algorithm operates on a 128-bit state, divided into four 32-bit words, denoted A, B, C and D. These are initialised to certain fixed constants. The main algorithm then operates on each 512-bit message block in turn, each block modifying the state. The processing of a message block consists of four similar stages, termed rounds; each round is composed of 16 similar operations based on a non-linear function F, modular addition, and left rotation. Figure 1 illustrates one operation within a round.
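The description above can be turned into a compact Python sketch. The initial constants, shift amounts and sine-derived additive constants follow RFC 1321; the names `md5_hex`, `SHIFTS`, `CONSTS` and `rotl` are illustrative, and the sketch favours readability over speed, so it is not a replacement for a library implementation.

```python
import math
import struct

MASK = 0xFFFFFFFF  # keep every result within 32 bits (modular addition)

# Left-rotation amounts and additive constants for the 64 operations.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
CONSTS = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def md5_hex(message: bytes) -> str:
    # Padding: a 1 bit, zeros, then the 64-bit message length in bits.
    data = message + b"\x80"
    data += b"\x00" * ((56 - len(data) % 64) % 64)
    data += struct.pack("<Q", (8 * len(message)) % 2**64)

    # The 128-bit state: four 32-bit words A, B, C, D.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    for offset in range(0, len(data), 64):
        words = struct.unpack("<16I", data[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):  # four rounds of 16 operations each
            if i < 16:
                f, k = (b & c) | (~b & d), i
            elif i < 32:
                f, k = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:
                f, k = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, k = c ^ (b | ~d), (7 * i) % 16
            total = (a + f + CONSTS[i] + words[k]) & MASK
            a, b, c, d = d, (b + rotl(total, SHIFTS[i])) & MASK, b, c
        # Each block modifies the state by adding the result back in.
        a0, b0, c0, d0 = (a0 + a) & MASK, (b0 + b) & MASK, (c0 + c) & MASK, (d0 + d) & MASK
    return struct.pack("<4I", a0, b0, c0, d0).hex()

print(md5_hex(b"abc"))
```

The result can be checked against the RFC 1321 test suite; for example, the empty message should give d41d8cd98f00b204e9800998ecf8427e.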

MD5 vs MD4…
• A fourth round has been added.
• Each step now has a unique additive constant.
• The function g in round 2 was changed from (XY v XZ v YZ) to (XZ v Y not(Z)) to make g less symmetric.
• Each step now adds in the result of the previous step. This promotes a faster "avalanche effect".
• The order in which input words are accessed in rounds 2 and 3 is changed, to make these patterns less like each other.
• The shift amounts in each round have been approximately optimized to yield a faster "avalanche effect". The shifts in different rounds are distinct.
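The change to g can be checked directly on single bits. This Python sketch is only an illustration: the names `g_md4`, `g_md5` and `is_symmetric` are hypothetical, and "symmetric" is taken here to mean that reordering the three inputs never changes the output.

```python
from itertools import permutations, product

def g_md4(x: int, y: int, z: int) -> int:
    """MD4 round-2 function: XY v XZ v YZ (bitwise majority)."""
    return (x & y) | (x & z) | (y & z)

def g_md5(x: int, y: int, z: int) -> int:
    """MD5 round-2 function: XZ v Y not(Z), evaluated on single bits."""
    return (x & z) | (y & ~z & 1)

def is_symmetric(func) -> bool:
    """True if func gives the same bit for every reordering of its inputs."""
    return all(
        func(*bits) == func(*perm)
        for bits in product((0, 1), repeat=3)
        for perm in permutations(bits)
    )

print("MD4 g symmetric:", is_symmetric(g_md4))
print("MD5 g symmetric:", is_symmetric(g_md5))
```

The MD4 function is a majority vote, which treats X, Y and Z identically; the MD5 version uses Z to select between X and Y, so the inputs no longer play interchangeable roles.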

Avalanche effect…
In cryptography, the avalanche effect refers to a desirable property of cryptographic algorithms, typically block ciphers and cryptographic hash functions. The avalanche effect is evident if, when an input is changed slightly (for example, flipping a single bit) the output changes significantly (e.g., half the output bits flip). In the case of quality block ciphers, such a small change in either the key or the plaintext should cause a drastic change in the cipher text. The actual term was first used by Horst Feistel, although the concept dates back to at least Shannon's diffusion.
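The effect can be measured with a short Python experiment using the standard `hashlib` module. The sample message and the helper names `flip_bit` and `bit_difference` are illustrative assumptions; the sketch flips each input bit in turn and counts how many of the 128 MD5 output bits change.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with bit number `index` inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)

message = b"message digest"
reference = hashlib.md5(message).digest()

# Flip every single input bit in turn and record how many output bits change.
changes = [
    bit_difference(reference, hashlib.md5(flip_bit(message, i)).digest())
    for i in range(8 * len(message))
]
average = sum(changes) / len(changes)
print(f"average of {average:.1f} out of 128 output bits changed per one-bit flip")
```

An average close to 64, half of the output bits, is what a strong avalanche effect looks like.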

Avalanche contd…
If a block cipher or cryptographic hash function does not exhibit the avalanche effect to a significant degree, then it has poor randomization, and thus a cryptanalyst can make predictions about the input given only the output. This may be sufficient to partially or completely break the algorithm. It is thus not a desirable condition — at least from one point of view. Constructing a cipher or hash to exhibit a substantial avalanche effect is one of the primary design objectives. This is why most block ciphers are product ciphers. It is also why hash functions have large data blocks.

SHA
The SHA (Secure Hash Algorithm) family is a set of related cryptographic hash functions. The most commonly used function in the family, SHA-1, is employed in a large variety of popular security applications and protocols, including TLS, SSL, PGP, SSH, S/MIME, and IPSec. SHA-1 is considered to be the successor to MD5, an earlier, widely used hash function. The SHA algorithms were designed by the National Security Agency (NSA) and published as a US government standard.

SHA contd…
The first member of the family, published in 1993, is officially called SHA; however, it is often called SHA-0 to avoid confusion with its successors. Two years later, SHA-1, the first successor to SHA, was published. Four more variants have since been issued with increased output ranges and a slightly different design: SHA-224, SHA-256, SHA-384, and SHA-512, sometimes collectively referred to as SHA-2.
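The output sizes of these variants can be listed with Python's standard `hashlib` module, shown here next to MD5 for comparison. This is a small sketch; SHA-0 is omitted because `hashlib` does not provide it, and the variable names are illustrative.

```python
import hashlib

# Digest sizes in bits for MD5 and the SHA variants named above.
names = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]
digest_bits = {name: hashlib.new(name).digest_size * 8 for name in names}

for name, bits in digest_bits.items():
    sample = hashlib.new(name, b"abc").hexdigest()
    print(f"{name:7s} {bits:3d} bits  {sample[:16]}...")
```

The number in each SHA-2 name is simply the digest length in bits.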

For more details…
Refer to:
• http://en.wikipedia.org/wiki/MD5
• http://www.fourmilab.ch/md5/
• www.cryptography.com/cnews/hash.html
• http://www.unix.org.ua/orelly/java-ent/security/ch0703.htm
• www.ietf.org/rfc/rfc1320.txt

Questions ????...

IF ANY

Thank You