This article provides a comprehensive overview of cryptographic hash functions, exploring their history, characteristics, applications in blockchain, and working principles. Cryptographic hash functions play a crucial role in ensuring the security and reliability of blockchain technology.

History of Cryptographic Hash Function

Characteristics of Cryptographic Hash Functions

Cryptographic Hash Function in Blockchain

How Cryptographic Hash Function Works in Blockchain

The Future of Cryptographic Hash Functions

Conclusion

History of Cryptographic Hash Function

Cryptographic hash functions (CHFs) have existed since the 1980s and are widely used in cryptography, data integrity verification, database indexing, and various other domains.

A cryptographic hash function takes input data of arbitrary length and transforms it into a fixed-length output value. The transformation applies a series of operations to successive segments of the input data, a process commonly referred to as hashing. The resulting output is known as the hash value, and the operation itself is termed the hash function.
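The fixed-length property is easy to observe with Python's standard `hashlib` module. The sketch below is only an illustration; the helper name `sha256_hex` and the sample inputs are assumptions chosen for this example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash value of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Illustrative inputs of very different lengths: empty, 3 bytes, 1,000,000 bytes.
samples = [b"", b"abc", b"a" * 1_000_000]

for sample in samples:
    digest = sha256_hex(sample)
    # Each hash value is 64 hexadecimal characters (256 bits), whatever the input size.
    print(len(sample), len(digest), digest[:16] + "...")
```

Whether the input is empty or a million bytes long, the output length stays the same.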

An illustrative example is the MD5 algorithm, frequently used by P2P download clients, which produces a 128-bit hash value. Users can compare the hash value of a downloaded file with the one published by the source; a match indicates that the file is very likely intact.
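A download check of this kind might look like the following sketch. The function names `file_digest` and `matches_published` are hypothetical helpers written for this article, and the file path and published hash would come from the download source.

```python
import hashlib


def file_digest(path: str, algorithm: str = "md5", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large downloads need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path: str, published_hex: str, algorithm: str = "md5") -> bool:
    """Compare the locally computed hash with the one published by the source."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

Passing `algorithm="sha256"` to the same helpers gives the check a stronger hash function without changing the workflow.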

Another prevalent application is password authentication on websites. To safeguard user passwords, most platforms store hashed passwords rather than plaintext entries. When a user logs in, the system computes the hash of the entered password and compares it with the stored value associated with the username. Because a cryptographic hash function is irreversible, hackers who obtain the hash values in the database cannot directly recover the passwords from them.
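The login flow can be sketched as follows. Note that this example goes slightly beyond the description above: it adds a random salt and uses the standard library's PBKDF2 function instead of a single plain hash, which is an assumption made here for illustration. The iteration count, the `users` table, and the sample password are likewise hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor, chosen only for illustration


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_hash); the plaintext password itself is never stored."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived


def verify_password(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Re-hash the login attempt and compare it with the stored value."""
    _, candidate = hash_password(password, salt)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(candidate, stored_hash)


# Hypothetical user table: username -> (salt, hash)
users = {"alice": hash_password("correct horse battery staple")}
```

At login, the server looks up the salt and hash stored for the username and calls `verify_password` with the password that was entered.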

Characteristics of Cryptographic Hash Functions

Searching ‘SHA256 Generator’ reveals that different websites using the same algorithm consistently generate identical hash values for identical input text.
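The same experiment can be run locally instead of on a website. The sketch below, using only the standard library, checks that hashing is repeatable and also counts how many output bits change when a single letter of the input changes case; the helper names and the sample strings are assumptions for this example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash value of a text string as hexadecimal."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex hash values differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


lower = sha256_hex("hello")
upper = sha256_hex("Hello")

# Hashing the same text again reproduces the same value.
print(sha256_hex("hello") == lower)
# A one-letter case change flips a large share of the 256 output bits.
print(differing_bits(lower, upper), "of 256 bits differ")
```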

Furthermore, changing even the case of a single letter in the input text produces an entirely different hash value, a property known as the avalanche effect. The following characteristics gauge a cryptographic hash function’s security:

Pre-image resistance: Retrieving the original input value from the output hash value is highly challenging due to the one-way function’s properties.

In the aforementioned example, reconstructing a user’s password from stolen hash values poses significant difficulty. The complex operations and information compression within cryptographic hash functions hinder reverse engineering, emphasizing their unidirectional nature.

Second pre-image resistance: Given an initial input, finding another input value that produces the same hash value is infeasible in practice. This property is also called weak collision resistance.
Collision resistance: Finding any two distinct values that yield identical hash values is infeasible in practice; such a pair is termed a cryptographic hash collision. This property is also called strong collision resistance.

Taking MD5 as an example, is it possible for different files to generate the same hash value? The answer is yes, but for honestly generated files the probability is extremely low. This phenomenon is known as a cryptographic hash collision, which can occur either accidentally or through deliberate attack. For two arbitrary inputs, the probability that their MD5 hash values coincide is about 1/2¹²⁸, making accidental occurrences very unlikely. However, MD5 is considered vulnerable to deliberate collision attacks, as producing the same hash value for two different plaintexts is relatively easy. Therefore, while the MD5 algorithm can still be used for tasks that do not involve security, it is no longer suitable for security authentication tasks (such as key authentication or digital signatures).
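Collisions become tangible once the hash value is made artificially short. The sketch below keeps only the first 24 bits of an MD5 hash value and searches for two inputs that share them. This is a generic brute-force search on a deliberately weakened hash, not the structural attack that breaks full MD5; the function names and the 24-bit setting are assumptions for this example.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, bits: int) -> int:
    """Keep only the first `bits` bits of the MD5 hash value (a deliberately weakened hash)."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)


def find_collision(bits: int):
    """Hash successive counters until two different inputs share a truncated hash value."""
    seen = {}
    for i in count():
        message = str(i).encode()
        tag = truncated_md5(message, bits)
        if tag in seen:
            # Return both colliding inputs and the number of inputs tried.
            return seen[tag], message, i + 1
        seen[tag] = message


first, second, attempts = find_collision(bits=24)
print(first, second, attempts)
```

A search like this typically needs on the order of 2^(bits/2) attempts, so a 24-bit value falls after a few thousand tries, while the same generic search against all 128 bits of MD5 would need roughly 2⁶⁴ attempts. Practical attacks on MD5 exploit weaknesses in its design to do far better than that.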

Cryptographic Hash Function in Blockchain

Ethereum uses the Keccak-256 cryptographic hash function, which many people mistakenly identify as SHA-3 (including in the doctoral thesis of Celestia’s founder) because the function was originally named ‘sha3’ in Solidity. To end the confusion, it was subsequently renamed keccak256.

MetaMask utilizes various cryptographic hash functions in its operations:

A set of 12 words, randomly drawn from the 2048-word BIP39 word list, forms the mnemonic phrase (seed phrase).
Each word corresponds to a numeric value, and together these values encode the wallet’s seed.
MetaMask derives private keys from this seed. Under BIP39, the mnemonic is stretched into the seed with PBKDF2 using HMAC-SHA512, while SHA-256 supplies the mnemonic’s checksum. The private key is sometimes what must be entered when importing an existing wallet on a new device.
The public key is derived from the private key through elliptic curve multiplication on secp256k1, the curve used by the ECDSA signature algorithm.
MetaMask hashes the public key with the Keccak-256 function, takes the last 20 bytes of the hash (40 hexadecimal characters), and prefixes them with 0x; the result is the ETH address.
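The final step, turning a public key into an address, can be sketched in plain Python. Python's `hashlib` offers the standardized SHA-3 (`sha3_256`) but not the original Keccak-256 padding that Ethereum uses, so the sketch includes a compact Keccak-256 written for this article, modeled on the structure of the Keccak team's compact reference code. It is meant for illustration, not production use. The function names are assumptions, and the public key is taken as given (64 bytes, the x and y coordinates), since deriving it from a private key requires secp256k1 arithmetic that is out of scope here.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(state: bytearray) -> bytearray:
    """Apply the 24-round Keccak-f[1600] permutation to a 200-byte state."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lfsr = 1
    for _ in range(24):
        # theta: mix each column's parity into its neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR in a round constant generated by a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y): 8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data: bytes) -> bytes:
    """Keccak-256 with the original padding (0x01), as used by Ethereum."""
    rate = 136  # bytes absorbed per permutation call for a 256-bit output
    state = bytearray(200)
    offset = 0
    while len(data) - offset >= rate:
        for i in range(rate):
            state[i] ^= data[offset + i]
        state = _keccak_f1600(state)
        offset += rate
    tail = data[offset:]
    for i, byte in enumerate(tail):
        state[i] ^= byte
    state[len(tail)] ^= 0x01  # standardized SHA-3 uses 0x06 here instead
    state[rate - 1] ^= 0x80
    state = _keccak_f1600(state)
    return bytes(state[:32])


def eth_address(public_key: bytes) -> str:
    """Derive an address from a 64-byte public key (x and y coordinates)."""
    if len(public_key) != 64:
        raise ValueError("expected a 64-byte public key")
    return "0x" + keccak256(public_key)[-20:].hex()
```

Hashing the same input with `hashlib.sha3_256` gives a different result, which is exactly the SHA-3 versus Keccak-256 distinction described above.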

How Cryptographic Hash Function Works in Blockchain

Bitcoin utilizes the SHA-256 cryptographic hash function. Here, we will elucidate the process through which Bitcoin miners engage with cryptographic hash functions during mining activities.

In Bitcoin mining, miners combine transaction data with a block header, which holds metadata such as a timestamp and a random number (referred to as the “nonce”). Miners strive to produce a SHA-256 hash that meets specific criteria, typically beginning with a set number of leading zeros, by repeatedly adjusting the nonce in the block header. Given the nature of the SHA-256 hash function, the only way to discover a compliant hash is to keep trying different nonces.

Upon finding a hash that fulfills the requirements, miners can append the block to the Bitcoin network’s blockchain and receive a designated quantity of Bitcoins as a reward. This process, known as “mining,” involves ongoing execution of hash functions to identify a hash value meeting the specified criteria.
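The trial-and-error search can be sketched as a toy proof-of-work loop. Bitcoin hashes the block header with SHA-256 twice; everything else here is simplified. The header is a placeholder byte string rather than a real 80-byte Bitcoin header, and `mine`, `difficulty_bits`, and the difficulty value are assumptions for this example.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Try successive nonces until the hash, read as an integer, is below the target."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = double_sha256(header + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no valid nonce in range; a real miner would change other header fields


# Placeholder header contents; 16 leading zero bits means 4 leading zero hex digits.
toy_header = b"previous-block-hash|transactions-summary|timestamp"
result = mine(toy_header, difficulty_bits=16)
print(result)
```

Each extra bit of difficulty doubles the expected number of attempts, which is how the network can tune how hard a block is to find.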

Beyond mining, cryptographic hash functions are pivotal in establishing links between blocks and tracking transaction alterations within blockchain systems. Hash pointers serve as data structures facilitating data indexing, retrieval, and verification of data modifications. Each transaction within the blockchain undergoes hashing before being organized into blocks. Subsequently, a hash pointer connects each block to its antecedent by storing a hash of the preceding block’s data. The interconnected nature of blocks ensures immutability within the blockchain; any modification to a transaction results in a distinct hash value, consequently altering the hashes of all subsequent blocks. For instance, consider a blockchain comprising two blocks:

Block 1: encompasses hashes of transactions T1, T2, and T3.
Block 2: features hashes of transactions T4, T5, and T6, along with Block 1’s hash.

Should someone attempt to tamper with transaction T1 in Block 1, they would need to recompute Block 1’s hash value and update the copy of it stored in Block 2. Moreover, because cryptographic hash functions are one-way and pre-image resistant, it is impractical to work backwards from the hash value stored in Block 2 to a modified Block 1 that would still produce it.

Furthermore, given that Block 2 incorporates Block 1’s hash value, tampering with Block 1 would consequently alter Block 2’s hash value. This necessitates simultaneous tampering with all subsequent blocks for any modifications within the blockchain—a formidable task. Consequently, cryptographic hash functions effectively uphold the coherence and integrity of blockchain data.
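The two-block example can be reproduced with a short sketch. It is a simplification made for illustration: each block hash covers the previous block's hash plus the hashes of its transactions, whereas real blockchains use richer headers. All function names and the placeholder transaction strings are assumptions.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compute_hash(transactions, prev_hash: str) -> str:
    """Hash the previous block's hash together with the hash of every transaction."""
    tx_hashes = [sha256_hex(tx) for tx in transactions]
    return sha256_hex(prev_hash + "".join(tx_hashes))


def make_block(transactions, prev_hash: str) -> dict:
    return {
        "transactions": list(transactions),
        "prev_hash": prev_hash,  # the hash pointer to the preceding block
        "hash": compute_hash(transactions, prev_hash),
    }


def chain_is_valid(chain) -> bool:
    """Check every block's own hash and its hash pointer to the preceding block."""
    for index, block in enumerate(chain):
        if compute_hash(block["transactions"], block["prev_hash"]) != block["hash"]:
            return False
        if index > 0 and block["prev_hash"] != chain[index - 1]["hash"]:
            return False
    return True


def build_chain():
    block1 = make_block(["T1", "T2", "T3"], prev_hash="0" * 64)
    block2 = make_block(["T4", "T5", "T6"], prev_hash=block1["hash"])
    return [block1, block2]


chain = build_chain()
print(chain_is_valid(chain))  # the untouched chain passes the check

chain[0]["transactions"][0] = "T1 (tampered)"
print(chain_is_valid(chain))  # Block 1 no longer matches its recorded hash

# Even if the attacker recomputes Block 1's hash, Block 2 still points to the old one.
chain[0]["hash"] = compute_hash(chain[0]["transactions"], chain[0]["prev_hash"])
print(chain_is_valid(chain))
```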

In the realm of blockchain, cryptographic hash functions fulfill several essential roles:

Block Linking: Each block’s header contains the previous block’s hash value, facilitating a connected chain of blocks ensuring tamper-evident integrity.
Transaction Validation: Transaction data undergoes hashing, with the resultant hash value included in the block, validating transaction authenticity and integrity.
Consensus Mechanism: Within the Proof of Work (PoW) consensus mechanism, miners must identify a nonce value meeting difficulty requirements by executing hash functions.

The Future of Cryptographic Hash Functions

On September 2, 2022, Vitalik Buterin posted a question on Twitter (X), asking which cryptographic hash function would remain secure if a quantum computer using Shor’s algorithm were to be invented.

Source: Vitalik tweet

He indicated that a quantum computer capable of running Shor’s algorithm could break RSA (a long-standing public-key cryptosystem) and anything else based on factorization, elliptic curves, or groups of unknown order. Hash functions such as SHA-256, however, fare well in the context of quantum computing: their security would be somewhat reduced, so the use of longer hash values is recommended.

Conclusion

How robust are cryptographic hash functions, such as SHA-256? The “256” in SHA-256 refers to the length of the output in bits, which means there are 2 raised to the power of 256 possible hash values, a figure so vast that it is challenging to grasp concretely.

Source: 3Blue1Brown

Nonetheless, 3Blue1Brown has presented a vivid analogy to aid in comprehending the security of cryptographic hash functions: envision a scenario where 4 billion individuals on Earth each possess a computer with exceptional computing capabilities, equivalent to 1,000 times the computing power of Google worldwide. Next, picture 4 billion copies of such a planet in a galaxy, and 4 billion such galaxies akin to the Milky Way! Even under these extreme conditions, it would take over 500 billion years before there is a 1 in 4 billion chance of correctly guessing “the specific input required to generate the SHA-256 output hash value.”
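The scale of 2 raised to the power of 256 can also be checked with a few lines of arithmetic. The attacker speed below is a purely hypothetical assumption chosen for this example, not a figure from the analogy, and `brute_force_years` is an illustrative helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
HASHES_PER_SECOND = 10**18  # hypothetical attacker speed, an assumption for illustration


def brute_force_years(bits: int, hashes_per_second: int) -> float:
    """Expected years to hit one specific hash value by trying half of all 2**bits values."""
    expected_guesses = 2 ** (bits - 1)
    return expected_guesses / hashes_per_second / SECONDS_PER_YEAR


outputs = 2 ** 256
print(len(str(outputs)), "decimal digits in 2**256")
print(2 ** 32, "is the '4 billion' that recurs in the analogy")
print(f"{brute_force_years(256, HASHES_PER_SECOND):.2e} years at the assumed speed")
```

Even at a quintillion guesses per second, the expected search time dwarfs the age of the universe many times over.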

Author: Morris

Translator: Paine

Reviewer(s): Wayne, Edward, Elisa, Ashley, Joyce

* The information is not intended to be and does not constitute financial advice or any other recommendation of any sort offered or endorsed by Gate.io.

* This article may not be reproduced, transmitted or copied without referencing Gate.io. Contravention is an infringement of Copyright Act and may be subject to legal action.