We have previously seen that hashing functions possess important characteristics that justify their use in Bitcoin. Let's now examine the internal mechanisms of these hashing functions that give them these properties, and to do this, I propose to dissect the operation of SHA256.
The SHA256 and SHA512 functions belong to the same SHA2 family. Their mechanism is based on a specific construction called Merkle-Damgård construction. RIPEMD160 also uses this same type of construction.
As a reminder, we have a message of arbitrary size as input to SHA256, and we will pass it through the function to obtain a 256-bit hash as output.
Pre-processing of the input
To begin, we need to prepare our input message so that it has a standard length that is a multiple of 512 bits. This step is crucial for the proper functioning of the algorithm subsequently.
To do this, we start with the padding bits step. We first add a separator bit 1 to the message, followed by a certain number of 0 bits. The number of 0 bits added is calculated so that the total length of the message after this addition is congruent to 448 modulo 512. Thus, the length of the message with the padding bits is equal to:
, for modulo, is a mathematical operation that, between two integers, returns the remainder of the Euclidean division of the first by the second. For example: . It is an operation widely used in cryptography.
Here, the padding step ensures that, after adding the 64 bits in the next step, the total length of the equalized message will be a multiple of 512 bits. If the initial message has a length of bits, the number () of 0 bits to be added is thus:
For example, if the initial message is 950 bits, the calculation would be as follows:
Thus, we would have 9 0s in addition to the separator 1. Our padding bits to be added directly after our message would thus be:
1000 0000 00
After adding the padding bits to our message , we also add a 64-bit representation of the original length of the message , expressed in binary. This allows the hash function to be sensitive to the order of bits and the length of the message.
If we go back to our example with an initial message of 950 bits, we convert the decimal number 950 into binary, which gives us 1110 1101 10. We complete this number with zeros at the base to make a total of 64 bits. In our example, this gives:
This padding size is added following the bit padding. Therefore, the message after our preprocessing consists of three parts:
The original message ;
A bit 1 followed by several bits 0 to form the bit padding;
A 64-bit representation of the length of to form the padding with the size.
Initialization of Variables
SHA256 uses eight initial state variables, denoted to , each of 32 bits. These variables are initialized with specific constants, which are the fractional parts of the square roots of the first eight prime numbers. We will use these values subsequently during the hashing process:
SHA256 also uses 64 other constants, denoted to , which are the fractional parts of the cube roots of the first 64 prime numbers:
Division of the Input
Now that we have an equalized input, we will now move on to the main processing phase of the SHA256 algorithm: the compression function. This step is very important, as it is primarily what gives the hash function its cryptographic properties that we studied in the previous chapter.
First, we start by dividing our equalized message (result of the preprocessing steps) into several blocks of 512 bits each. If our equalized message has a total size of bits, we will therefore have blocks, each of 512 bits. Each 512-bit block will be processed individually by the compression function, which consists of 64 rounds of successive operations. Let's name these blocks , , ...
Logical Operations
Before exploring the compression function in detail, it is important to understand the basic logical operations used in it. These operations, based on Boolean algebra, operate at the bit level. The basic logical operations used are:
Conjunction (AND): denoted , corresponds to a logical "AND".
Disjunction (OR): denoted , corresponds to a logical "OR".
Negation (NOT): denoted , corresponds to a logical "NOT".
From these basic operations, we can define more complex operations, such as the "Exclusive OR" (XOR) denoted , which is widely used in cryptography.
Every logical operation can be represented by a truth table, which indicates the result for all possible combinations of binary input values (two operands and ).
For XOR ():
0
0
0
0
1
1
1
0
1
1
1
0
For AND ():
0
0
0
0
1
0
1
0
0
1
1
1
For NOT ():
0
1
1
0
Let's take an example to understand the operation of XOR at the bit level. If we have two binary numbers on 6 bits:
Then:
By applying XOR bit by bit:
Bit Position
1
1
0
1
2
0
0
0
3
1
1
0
4
1
0
1
5
0
0
0
6
0
0
0
The result is therefore .
In addition to logical operations, the compression function uses bit-shifting operations, which will play an essential role in the diffusion of bits in the algorithm.
First, there is the logical right shift operation, denoted , which shifts all the bits of to the right by positions, filling the vacant bits on the left with zeros.
For example, for (on 9 bits) and :
Schematically, the right shift operation could be seen like this:
Another operation used in SHA256 for bit manipulation is the right circular rotation, denoted , which shifts the bits of to the right by positions, reinserting the shifted bits at the beginning of the string.
For example, for (over 9 bits) and :
Schematically, the right circular shift operation could be seen like this:
Compression Function
Now that we have understood the basic operations, let's examine the SHA256 compression function in detail.
In the previous step, we divided our input into several 512-bit pieces . For each 512-bit block , we have:
The message words : for from 0 to 63.
The constants : for from 0 to 63, defined in the previous step.
The state variables : initialized with the values from the previous step.
The first 16 words, to , are directly extracted from the processed 512-bit block . Each word consists of 32 consecutive bits from the block. So, for example, we take our first piece of input , and we further divide it into smaller 32-bit pieces that we call words.
The next 48 words ( to ) are generated using the following formula:
With:
In this case, equals for and for .
Once we have determined all the words for our 512-bit piece, we can move on to the compression function, which consists of performing 64 rounds.
For each round from 0 to 63, we have three different types of inputs. First, the that we have just determined, partly consisting of our message piece . Next, the 64 constants . Finally, we use the state variables , , , , , , , and , which will evolve throughout the hashing process and be modified with each compression function. However, for the first piece , we use the initial constants given previously.
We then perform the following operations on our inputs:
Function :
Function :
Function ("Choose"):
Function ("Majority"):
We then calculate 2 temporary variables:
:
:
Next, we update the state variables as follows:
The following diagram represents a round of the SHA256 compression function as we have just described:
The arrows indicate the flow of data;
The boxes represent the operations performed;
The surrounded represent the addition modulo .
We can already observe that this round outputs new state variables , , , , , , , and . These new variables will serve as input for the next round, which will in turn produce new variables , , , , , , , and , to be used for the following round. This process continues up to the 64th round.
After the 64 rounds, we update the initial values of the state variables by adding them to the final values at the end of round 64:
These new values of , , , , , , , and will serve as the initial values for the next block, . For this block , we replicate the same compression process with 64 rounds, then we update the variables for block , and so on until the last block of our equalized input.
After processing all the message blocks, we concatenate the final values of the variables , , , , , , , and to form the final 256-bit hash of our hashing function:
Each variable is a 32-bit integer, so their concatenation always yields a 256-bit result, regardless of the size of our message input to the hashing function.
Justification of Cryptographic Properties
But then, how is this function irreversible, collision-resistant, and tamper-resistant?
For tamper resistance, it's quite simple to understand. There are so many calculations performed in cascade, which depend both on the input and the constants, that the slightest modification of the initial message completely changes the path taken, and thus completely changes the output hash. This is what is called the avalanche effect. This property is partly ensured by the mixing of the intermediate states with the initial states for each piece.
Next, when discussing a cryptographic hash function, the term "irreversibility" is not generally used. Instead, we talk about "preimage resistance," which specifies that for any given , it is difficult to find an such that . This preimage resistance is guaranteed by the algebraic complexity and the strong non-linearity of the operations performed in the compression function, as well as by the loss of certain information in the process. For example, for a given result of an addition modulo, there are several possible operands:
In this example, knowing only the modulo used (10) and the result (5), one cannot determine with certainty which are the correct operands used in the addition. It is said that there are multiple congruences modulo 10.
For the XOR operation, we are faced with the same problem. Remember the truth table for this operation: any 1-bit output can be determined by two different input configurations that have exactly the same probability of being the correct values. Therefore, one cannot determine with certainty the operands of an XOR by knowing only its result. If we increase the size of the XOR operands, the number of possible inputs knowing only the result increases exponentially. Moreover, XOR is often used alongside other bit-level operations, such as the operation, which add even more possible interpretations to the result.
The compression function also uses the operation. This operation removes a part of the basic information, which is then impossible to retrieve later. Once again, there is no algebraic means to reverse this operation. All these one-way and information-loss operations are used very frequently in compression functions. The number of possible inputs for a given output is thus almost infinite, and each attempt at reverse calculation would lead to equations with a very high number of unknowns, which would increase exponentially at each step.
Finally, for the characteristic of collision resistance, several parameters come into play. The pre-processing of the original message plays an essential role. Without this pre-processing, it might be easier to find collisions on the function. Although, theoretically, collisions exist (due to the pigeonhole principle), the structure of the hash function, combined with the aforementioned properties, makes the probability of finding a collision extremely low.
For a hash function to be collision-resistant, it is essential that:
The output is unpredictable: Any predictability can be exploited to find collisions faster than with a brute force attack. The function ensures that each bit of the output depends in a non-trivial way on the input. In other words, the function is designed so that each bit of the final result has an independent probability of being 0 or 1, even if this independence is not absolute in practice.
The distribution of hashes is pseudo-random: This ensures that the hashes are uniformly distributed.
The size of the hash is substantial: the larger the possible space for results, the more difficult it is to find a collision.
Cryptographers design these functions by evaluating the best possible attacks to find collisions, then adjusting the parameters to render these attacks ineffective.
Merkle-Damgård Construction
The structure of SHA256 is based on the Merkle-Damgård construction, which allows transforming a compression function into a hash function that can process messages of arbitrary length. This is precisely what we have seen in this chapter.
However, some old hash functions like SHA1 or MD5, which use this specific construction, are vulnerable to length extension attacks. This is a technique that allows an attacker who knows the hash of a message and the length of (without knowing the message itself) to calculate the hash of a message formed by concatenating with additional content.
SHA256, even though it uses the same type of construction, is theoretically resistant to this type of attack, unlike SHA1 and MD5. This might explain the mystery of the double hashing implemented throughout Bitcoin by Satoshi Nakamoto. To avoid this type of attack, Satoshi might have preferred to use a double SHA256:
This enhances security against potential attacks related to the Merkle-Damgård construction, but it does not increase the hashing process's security in terms of collision resistance. Moreover, even if SHA256 had been vulnerable to this type of attack, it would not have had a serious impact, as all use cases of hash functions in Bitcoin involve public data. However, the length extension attack might only be useful for an attacker if the hashed data are private and the user has used the hash function as an authentication mechanism for these data, similar to a MAC. Thus, the implementation of double hashing remains a mystery in the design of Bitcoin.
Now that we have looked in detail at the workings of hash functions, particularly SHA256, which is used extensively in Bitcoin, we will focus more specifically on the cryptographic derivation algorithms used at the application level, especially for deriving the keys for your wallet.
Quiz
Quiz
cyp2012.2
1/5
What is the main difference between ShR and RotR operations?