Secure Digest Functions
Lots of ways are available to produce a fixed-length bit pattern that characterizes an arbitrary-length message or document. Perhaps simplest is to use the XOR operation iteratively to combine fixed-length pieces of source document. Such a function is often used in communication protocols to produce a short fixed length hash to characterize a message for error-detection purposes, but it is inadequate as basis for a digital signature scheme. A secure digest function h = H(M) should have the following properties:
1) Given M, it is easy to compute h.
2) Given h, it is hard to compute M.
3) Given M, it is hard to find another message M’, such that H(M) = H(M’).
Such functions are also called one way hash functions. Reason for this name is self evident based on the first two properties. Property 3 demands an additional feature, even though we know that result of a hash function cannot be unique (because the digest is an information reducing transformation), we need to be sure that an attacker, given a message M that produces a hash h, cannot discover another message M’ that also produces h. If an attacker could do this, then they could forge a signed document M’ without knowledge of signing key by copying signature from signed document M and appending it to M’.
Secure digest function needs to be carefully designed. Bit level operations used and their sequencing are similar to those found in symmetric cryptography, but in this case operations need not be information preserving, since function is definitely not intended to be reversible. So a secure digest function can make use of full range of arithmetic and bit wise logical operations. Length of source text is usually included in digested data.
Two widely used digest functions for practical applications are MD5 algorithm (It’s fifth in a sequence of message digest algorithms developed by Ron Rivest) and SHA-1 (Secure Hash Algorithm), which has been adopted for standardization by US National Institute for Standards and Technology (NIST). Both have been carefully tested and analyzed and can be considered adequately secure for foreseeable future, while their implementations are reasonably efficient. We describe them briefly here. Schneier and Mitchell et al survey digital signature techniques and message digest functions in depth.
MD5
MD5 algorithm uses four rounds, each applying one of four nonlinear functions to each of 16 32 bit segments of 512 bit block of source text. Result is 128 bit digest. MD5 is one of the most efficient algorithms currently available.
MD5 digests have been widely used in the software world to provide assurance about integrity of transferred file. For example, file servers often provide pre computed MD5 checksum for files, so that a user can compare checksum of downloaded file to it.
SHA-1
SHA-1 is an algorithm that produces 160 bit digest. It is based on Rivest’s MD4 algorithm (which is similar to MD5), with some additional operations. It is substantially slower than MD5, but 160 bit digest does offer greater security against brute force and birthday-style attacks. SHA algorithms that deliver longer digests (224, 256 and 512 bits) are also included in standard. Of course, their additional length implies additional costs for generation, storage and communication of digital signatures and MACs, but following publication of attacks on SHA-1’s predecessors, which suggest that SHA-1 is vulnerable, NIST announced that is to be superseded by the longer SHA digest versions in US government software by 2010.