Merkle-Damgård construction is a method of building collision-resistant cryptographic hash functions from collision-resistant one-way compression functions. This construction was used in the design of popular hash algorithms such as MD5, SHA1 and SHA2.

To make it easier to understand:

Merkle-Damgård constructs a variable-length collision resistant hash function from a fixed-length collision-resistant hash function .

Construction

Suppose:

  • Block size is .

  • inputs length and outputs length .

  • inputs whose length is .

The construction is as follows:

  1. Set block size . Pad with s so that its length is multiple of . Now we have -bit blocks .

  2. (Length Padding) Let , where is encoded using exactly bits. Now we have .

  3. Let . This is the initialization vector (IV).

  4. For , compute .

  5. Output .

A picture is more than a thousand words:

Merkle-Damgard_hash_big.svg

Proof of security

Theorem: If is collision-resistant, then so is .

Proof: We prove by contraposition. Suppose is not collision-resistant, which means we can easily find 2 different inputs such that . Let and be the input blocks. Recall that and . There are 2 cases:

  1. . In this case we only look at the last block:

    Because , we have . So we have found a collision for .

  2. . This implies . In this case we look backwards from the last block:

    • We have .

      If or , then we have found a collision for . Else, we have and:

    • We have .

      If or , then we have found a collision for . Else, we have and:

    • … Else, we have . But if we get here, then for all , which implies . This is a contradiction.

MD-compliant padding

The padding scheme used in the Merkle–Damgård construction must be chosen carefully to ensure the security of the scheme. The above construction uses length padding, but there are other MD-compliant padding schemes as well. For details about MD-compliant padding, check Wikipedia.

References