Hash algorithms are primarily used to check whether a message has changed in transit. A hash acts as a kind of checksum that protects message integrity, and hashes are also used in many security applications, including password storage and digital signatures. A “hash value” or “message digest” is a fixed-length value, whose length depends on the algorithm, calculated from the message passed to the hash function. Even a small change in the input to the hash function produces a significantly different hash value. The hash value reveals nothing meaningful about the content of the message; its primary purpose is to be compared with the hash computed by the receiver, to verify the integrity of the message.

Hash function

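A cryptographic hash function has the following characteristics. They describe iterated (Merkle-Damgård) hash functions such as MD5, SHA-1 and SHA-256. Although the two are different cryptographic primitives, the hashing process resembles Cipher Block Chaining (CBC) in block cipher encryption: each block is processed together with the result of the previous block. The hash length extension attack leverages these characteristics.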

  • Internal state
    • Hash functions maintain an internal state that is updated with each block of data processed. A hash function starts from an initial state called the Initialisation Vector (IV), which is a fixed value defined by the hash function specification. Each block of the message is processed together with the current state (the IV for the first block), and the result becomes the new internal state
  • Message padding
    • Before hashing, the message is padded to a length that is a multiple of the hash function’s block size. This allows the function to efficiently process the message in fixed-size chunks
    • The specific padding scheme varies depending on the hash function. It typically involves adding a specific pattern of bits to the end of the message, along with the original message length encoded in a particular way
  • Iterative processing
    • The message is processed iteratively in fixed-size blocks (e.g. 512 bits for SHA-256)
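    • The output of processing each block becomes the input state for the next block, taking the role that the IV plays for the first block. This is called the chaining mechanism. For functions such as SHA-256, the state after the final block is output as the hash value.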
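The three characteristics above can be sketched in a few lines of Python. This is a toy construction, not real SHA-256: the padding follows the SHA-256 layout (a 0x80 byte, zero bytes, then the 64-bit big-endian bit length), but the all-zero `IV` and the `compress` function are hypothetical stand-ins chosen only to make the chaining visible. The last lines illustrate why a length extension is possible in such a design: because the digest is the final internal state, anyone who knows the digest and the message length can keep hashing from it.

```python
import hashlib
import struct

BLOCK_SIZE = 64      # bytes (512 bits), the block size of SHA-256
IV = bytes(32)       # hypothetical initial state; real SHA-256 uses constants from its specification


def md_padding(message_len: int) -> bytes:
    """SHA-256 style padding for a message of message_len bytes."""
    zeros = (BLOCK_SIZE - (message_len + 1 + 8) % BLOCK_SIZE) % BLOCK_SIZE
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", message_len * 8)


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function (an assumption, NOT the real SHA-256 one)."""
    return hashlib.sha256(state + block).digest()


def toy_hash(message: bytes, state: bytes = IV, prior_len: int = 0) -> bytes:
    """Pad the message, then chain the state through every block.

    prior_len is the number of bytes already absorbed into `state`
    (0 when starting from the IV).
    """
    padded = message + md_padding(prior_len + len(message))
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state  # the final internal state is the digest


# Length extension on the toy hash: only the digest and the length are needed.
secret_and_msg = b"secret-key" + b"amount=10"
digest = toy_hash(secret_and_msg)
glue = md_padding(len(secret_and_msg))       # the padding the hash appended internally
extension = b"&amount=9999"
forged = toy_hash(extension, state=digest,
                  prior_len=len(secret_and_msg) + len(glue))

print(forged == toy_hash(secret_and_msg + glue + extension))  # True
```

The forged digest is computed without ever reading the content of `secret_and_msg`, which is the essence of the attack on hashes built this way.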

Properties of hash algorithms

  • Hashing is one-way: it is computationally infeasible to reverse a hash and obtain the original message (pre-image resistance)
  • It is easy to compute the hash of any given data
  • Given a text and its hash, it is difficult to construct a different text with the same hash (second pre-image resistance)
  • It is difficult to modify a given text without changing its hash (uniqueness)
  • It is difficult to find any two different messages that have the same hash (collision resistance)
  • Uniqueness
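    • Different messages should, in practice, result in different digests. Even a small change in the input message should produce a significantly different hash value (the avalanche effect). Because the digest has a fixed length, collisions must exist in principle; the goal is that they are infeasible to find.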
  • Second pre-image resistance
    • Given an input x and its hash H(x), it is computationally infeasible to find a different input y such that H(x) = H(y), where x ≠ y. In other words, if you have a specific input and its hash value, it should be difficult to find another input that hashes to the same value.
    • A successful second pre-image attack lets an attacker replace the original message with fraudulent data that still matches the original hash.
  • Collision resistance
    • It should be computationally infeasible to find two different messages (n and m) that produce the same hash value. In mathematical terms, given a hash function H, it should be difficult to find n ≠ m such that H(n) = H(m).
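The claim that a small change in the input gives a very different digest is easy to observe with `hashlib`. The sketch below changes one character of a message and counts how many of the 256 output bits differ; the message text and the helper name `bit_difference` are illustrative choices only.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


d1 = hashlib.sha256(b"transfer $100 to Alice").digest()
d2 = hashlib.sha256(b"transfer $900 to Alice").digest()  # one character changed

changed = bit_difference(d1, d2)
print(f"{changed} of {len(d1) * 8} bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to flip, so the count should land somewhere near 128.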

Second pre-image resistance VS collision resistance

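In a second pre-image attack, the attacker is given a specific input and its hash value as the target, and must find another input with the same hash value. In a collision attack there is no fixed target: the attacker is free to choose both inputs and only needs to find any two different inputs that produce the same hash value. This freedom makes collisions much easier to find than second pre-images.

Neither MD5 nor SHA-1 is collision resistant any more: practical collisions have been demonstrated for both.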
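The difference in effort can be observed on a deliberately weakened hash. The sketch below assumes a hypothetical `tiny_hash` that keeps only the first 16 bits of SHA-256, so that both searches finish quickly; the message formats are arbitrary.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes) -> bytes:
    """Hypothetical weakened hash: the first 2 bytes (16 bits) of SHA-256."""
    return hashlib.sha256(data).digest()[:2]


def find_collision():
    """Birthday search: hash arbitrary inputs until any two of them agree."""
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_second_preimage(target_msg: bytes):
    """Targeted search: find a different input matching one fixed hash value."""
    target = tiny_hash(target_msg)
    for i in count():
        msg = b"try-%d" % i
        if msg != target_msg and tiny_hash(msg) == target:
            return msg, i + 1


a, b, collision_tries = find_collision()
other, preimage_tries = find_second_preimage(b"original message")

print("collision found after", collision_tries, "attempts")
print("second pre-image found after", preimage_tries, "attempts")
```

With a 16-bit output, the birthday search is expected to succeed after a number of attempts on the order of 2^8, while the targeted search needs on the order of 2^16. The same square-root gap applies to full-length hashes.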

Hash used in message integrity

Hash algorithms are primarily used to check the integrity of a message. The sender generates a hash of the message and sends it along with the message; the recipient independently calculates the hash of the received message and compares it to the received hash. If the two hashes match, it suggests that the message was not altered during transmission. A plain hash only detects accidental changes, because an attacker who can modify the message can also recompute the hash. A real-life mechanism that addresses this is HMAC, which mixes a shared secret key into the hash.
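The sender and recipient steps described above can be sketched with Python's standard `hmac` module. The key, the messages, and the helper names `make_tag` and `verify` are made up for illustration; in a real exchange the key would be agreed on in advance and kept secret.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute an HMAC-SHA256 tag to send along with the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recipient side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"shared-secret-key"            # hypothetical pre-shared key
message = b"pay 10 dollars to Bob"
tag = make_tag(key, message)

print(verify(key, message, tag))                    # True
print(verify(key, b"pay 99 dollars to Bob", tag))   # False: message was altered
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading bytes matched.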

Hash used in password security

Hashing transforms a plaintext password into a hash value. It is used for both password storage and password verification.

Storing password securely

When a user creates a password, the password is passed through a cryptographic hash function. The hash function converts the plaintext password into an unreadable, irreversible hashed string. Instead of storing the plaintext password in the database, the system stores the hashed password. This way, even if the database is compromised, the attacker cannot easily recover the original password from the hash.

Verifying password

When the user attempts to log in, they enter their password. The system hashes the entered password using the same hash function that was used during registration, then compares the result with the hash value stored in the database. If the hashes match, the password is correct.
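The registration and login steps above can be sketched as follows. The `users` dictionary is a hypothetical stand-in for the database, and a single unsalted SHA-256 is used only because it mirrors the description; it is not a recommendation for production password storage.

```python
import hashlib
import hmac

users = {}  # hypothetical in-memory stand-in for the user table


def hash_password(password: str) -> str:
    """Hash the plaintext password; only this value is ever stored."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Registration: store the hash, never the plaintext."""
    users[username] = hash_password(password)


def login(username: str, password: str) -> bool:
    """Login: hash the entered password and compare with the stored hash."""
    stored = users.get(username)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_password(password))


register("alice", "correct horse battery staple")
print(login("alice", "correct horse battery staple"))  # True
print(login("alice", "wrong guess"))                   # False
```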

Enhance hashed password security

Although passwords are hashed in the database, they are not guaranteed to be secure. Attackers can use dictionary attacks or rainbow table attacks to try to crack the hashed passwords. The attackers' methods and the preventive measures are discussed in Password Based Attacks.
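One widely used countermeasure, shown here only as a sketch and not taken from this note, is to combine a random per-user salt with a deliberately slow key derivation function. The example uses `hashlib.pbkdf2_hmac` from the standard library; the iteration count and salt length are assumed values that a real system would tune.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for illustration; tune for real use


def hash_password(password: str, salt: bytes = None):
    """Return (salt, derived_hash); a fresh random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the hash with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)


salt_a, hash_a = hash_password("hunter2")
salt_b, hash_b = hash_password("hunter2")

# Same password, different salts: the stored values differ, so a single
# precomputed table cannot cover both users.
print(hash_a != hash_b)
print(verify_password("hunter2", salt_a, hash_a))  # True
print(verify_password("hunter3", salt_a, hash_a))  # False
```

The salt is stored next to the hash; it does not need to be secret, it only needs to be different for every user.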

SHA-256 in Python

SHA-256 is a hash algorithm standardised in the United States by NIST (FIPS 180-4). Python provides it through the hashlib library.

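import hashlib

# hashlib operates on bytes, so the inputs are bytes literals (required in Python 3)
h = hashlib.sha256(b"green eggs")
print(h.hexdigest())  # digest of b"green eggs"

# update() appends more data to what has already been hashed
h.update(b" and ham")
print(h.hexdigest())  # digest of b"green eggs and ham"

Program output:

6a3d501466f63d04d8ecd9ff3efe376e7784d0a3bbb21a9ce2fe4be12f77fbd2
a113a9854ab71a4914e219e181ca8bfd48d7d65bdb1c3cb1bad6235c5f1acf23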


Reference - H. Nahari and R. L. Krutz, Web Commerce Security: Design and Development. Wiley & Sons