The second cryptographic method used in Bitcoin involves digital signature algorithms. Let's explore what this entails and how it works.
Bitcoins, UTXOs, and Spending Conditions
The term "wallet" in Bitcoin can be quite confusing for beginners. Indeed, what is called a Bitcoin wallet is software that does not directly hold your bitcoins, unlike a physical wallet that can hold coins or bills. Bitcoins are simply units of account. This unit of account is represented by UTXO (Unspent Transaction Outputs), which are unspent transaction outputs. If these outputs are unspent, it means they belong to a user. UTXOs are, in a way, pieces of bitcoins, of variable size, belonging to a user.
The Bitcoin protocol is distributed and operates without a central authority. Therefore, it is not like traditional banking records, where the euros that belong to you are simply associated with your personal identity. In Bitcoin, your UTXOs belong to you because they are protected by spending conditions specified in the Script language. To simplify, there are two types of scripts: the locking script (scriptPubKey), which protects a UTXO, and the unlocking script (scriptSig), which allows unlocking a UTXO and thus spending the bitcoin units it represents.
The initial operation of Bitcoin with P2PK scripts involves using a public key to lock funds, specifying in a scriptPubKey that the person wishing to spend this UTXO must provide a valid signature with the private key corresponding to this public key. To unlock this UTXO, it is therefore necessary to provide a valid signature in the scriptSig. As their names suggest, the public key is known to all since it is broadcast on the blockchain, while the private key is only known to the legitimate owner of the funds.
This is the basic operation of Bitcoin, but over time, this operation has become more complex. First, Satoshi also introduced P2PKH scripts, which use a receiving address in the scriptPubKey, which represents the hash of the public key. Then, the system became even more complex with the arrival of SegWit and then Taproot. However, the general principle remains fundamentally the same: a public key or a representation of this key is used to lock UTXOs, and a corresponding private key is required to unlock them and thus spend them.
A user who wishes to make a Bitcoin transaction must therefore create a digital signature using their private key on the transaction. The signature can be verified by other network participants. If it is valid, this means that the user initiating the transaction is indeed the owner of the private key, and therefore the owner of the bitcoins they wish to spend. Other users can then accept and propagate the transaction.
As a result, a user who owns bitcoins locked with a public key must find a way to securely store what allows unlocking their funds: the private key. A Bitcoin wallet is precisely a device that will allow you to easily keep all your keys without other people having access to them. It is therefore more like a keychain than a wallet.
The mathematical link between a public key and a private key, as well as the ability to perform a signature to prove the possession of a private key without revealing it, are made possible by a digital signature algorithm. In the Bitcoin protocol, two signature algorithms are used: ECDSA (Elliptic Curve Digital Signature Algorithm) and the Schnorr signature scheme. ECDSA is the digital signature protocol used in Bitcoin from its beginnings. Schnorr is more recent in Bitcoin, as it was introduced in November 2021 with the Taproot update.
These two algorithms are quite similar in their mechanisms. They are both based on elliptic curve cryptography. The major difference between these two protocols lies in the structure of the signature and some specific mathematical properties. We will therefore study the functioning of these algorithms, starting with the oldest: ECDSA.
Elliptic Curve Cryptography
Elliptic Curve Cryptography (ECC) is a set of algorithms that use an elliptic curve for its various mathematical and geometric properties for cryptographic purposes. The security of these algorithms relies on the difficulty of the discrete logarithm problem on elliptic curves. Elliptic curves are notably used for key exchanges, asymmetric encryption, or for creating digital signatures.
An important property of these curves is that they are symmetric with respect to the x-axis. Thus, any non-vertical line cutting the curve at two distinct points will always intersect the curve at a third point. Moreover, any tangent to the curve at a non-singular point will intersect the curve at another point. These properties will be useful for defining operations on the curve.
Here is a representation of an elliptic curve over the field of real numbers:
Every elliptic curve is defined by an equation of the form:
secp256k1
To use ECDSA or Schnorr, one must choose the parameters of the elliptic curve, that is, the values of and in the curve equation. There are different standards of elliptic curves that are reputed to be cryptographically secure. The most well-known is the secp256r1 curve, defined and recommended by the NIST (National Institute of Standards and Technology).
Despite this, Satoshi Nakamoto, the inventor of Bitcoin, chose not to use this curve. The reason for this choice is unknown, but some believe he preferred to find an alternative because the parameters of this curve could potentially contain a backdoor. Instead, the Bitcoin protocol uses the standard secp256k1 curve. This curve is defined by the parameters and . Its equation is therefore:
Its graphical representation over the field of real numbers looks like this:
However, in cryptography, we work with finite sets of numbers. More specifically, we work on the finite field , which is the field of integers modulo a prime number .
Definition: A prime number is a natural integer greater than or equal to 2 that has only two distinct positive integer divisors: 1 and itself. For example, the number 7 is a prime number since it can only be divided by 1 and 7. On the other hand, the number 8 is not prime because it can be divided by 1, 2, 4, and 8.
In Bitcoin, the prime number used to define the finite field is very large. It is chosen in such a way that the order of the field (i.e., the number of elements in ) is sufficiently large to ensure cryptographic security.
The prime number used is:
p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
In decimal notation, this corresponds to:
Thus, the equation of our elliptic curve is actually:
Given that this curve is defined over the finite field , it no longer resembles a continuous curve but rather a discrete set of points. For example, here is what the curve used in Bitcoin looks like for a very small :
In this example, I have intentionally limited the finite field to for educational reasons, but one must imagine that the one used in Bitcoin is immensely larger, almost .
We use a finite field of integers modulo to ensure the accuracy of operations on the curve. Indeed, elliptic curves over the field of real numbers are subject to inaccuracies due to rounding errors during computational calculations. If numerous operations are performed on the curve, these errors accumulate and the final result can be incorrect or difficult to reproduce. The exclusive use of positive integers ensures perfect accuracy of calculations and thus reproducibility of the result.
The mathematics of elliptic curves over finite fields are analogous to those over the field of real numbers, with the adaptation that all operations are performed modulo . To simplify explanations, we will continue in the following chapters to illustrate concepts using a curve defined over real numbers, while keeping in mind that, in practice, the curve is defined over a finite field.
If you wish to learn more about the mathematical foundations of modern cryptography, I also recommend consulting this other course on Plan ₿ Academy: