Bitcoin Gouvernance

When shit hits the fan

Bitcoin Development Philosophy

  • Responsible disclosure
  • Bitcoin's Traumatic childhood
  • Conclusion about When Shit Hits The Fan
Bitcoin is built by people. People write the software, and people then run this software. When a security vulnerability or a severe bug is discovered - is there really a distinction between the two? - it's always discovered by people, flesh and blood. This chapter contemplates what people do, should do, and shouldn't do when shit hits the fan. The first section explains the term responsible disclosure, which refers to how someone who discovers a vulnerability can act responsibly to help minimize the damage from it. The rest of the chapter takes you on a tour through some of the most severe vulnerabilities discovered over the years, and how they were handled by developers, miners, and users. Things were not as rigorous in Bitcoin's early childhood as they are today.

Responsible disclosure

Imagine you discover a bug in Bitcoin Core, a bug that allows anyone to remotely shut down a Bitcoin Core node by using some specially crafted network messages. Imagine also you are not malicious and would like this issue to remain unexploited. What do you do? If you remain silent about it, someone else will probably discover the issue, and you can't be sure that person won't be malicious.
When a security issue is discovered, the person discovering it should employ responsible disclosure, a term often used among Bitcoin developers. The term is explained on Wikipedia:
Developers of hardware and software often require time and resources to repair their mistakes. Often, it is ethical hackers who find these vulnerabilities. Hackers and computer security scientists have the opinion that it is their social responsibility to make the public aware of vulnerabilities. Hiding problems could cause a feeling of false security. To avoid this, the involved parties coordinate and negotiate a reasonable period of time for repairing the vulnerability. Depending on the potential impact of the vulnerability, the expected time needed for an emergency fix or workaround to be developed and applied and other factors, this period may vary between a few days and several months.
This means that if you find a security issue, you should report it to the team responsible for the system. But what does this mean in the context of Bitcoin? No one controls Bitcoin, but there's currently a focal point for Bitcoin development, namely the Bitcoin Core GitHub repository. The maintainers of said repository are responsible for the code in it, but they're not responsible for the system as a whole - no one is. Nevertheless, the general best practice is to send an email to security@bitcoincore.org.
In an email thread titled "Responsible disclosure of bugs" from 2017, Anthony Towns tried to summarize what he perceived to be the current best practices. He had collected inputs from several sources and different people to inform his view on the subject.
  • Vulnerabilities should be reported via security at bitcoincore.org
  • A critical issue (that can be exploited immediately or is already being exploited causing large harm) will be dealt with by:
    • a released patch ASAP
    • wide notification of the need to upgrade (or to disable affected systems)
    • minimal disclosure of the actual problem, to delay attacks
  • A non-critical vulnerability (because it is difficult or expensive to exploit) will be dealt with by:
    • patch and review undertaken in the ordinary flow of development
    • backport of a fix or workaround from master to the current released version
  • Devs will attempt to ensure that publication of the fix does not reveal the nature of the vulnerability by providing the proposed fix to experienced devs who have not been informed of the vulnerability, telling them that it fixes a vulnerability, and asking them to identify the vulnerability.
  • Devs may recommend other bitcoin implementations adopt vulnerability fixes prior to the fix being released and widely deployed, if they can do so without revealing the vulnerability; eg, if the fix has significant performance benefits that would justify its inclusion.
  • Prior to a vulnerability becoming public, devs will generally recommend to friendly altcoin devs that they should catch up with fixes. But this is only after the fixes are widely deployed in the bitcoin network.
  • Devs will generally not notify altcoin developers who have behaved in a hostile manner (eg, using vulnerabilities to attack others, or who violate embargoes).
  • Bitcoin devs won't disclose vulnerability details until >80% of bitcoin nodes have deployed the fixes. Vulnerability discovers are encouraged and requested to follow the same policy. [1] [6]
This list displays how careful one must be when publishing patches for Bitcoin, since the patch itself might give away the vulnerability. The fourth bullet is particularly interesting as it explains how to test whether a patch has been disguised well enough. Indeed, if a few really experienced developers can't spot the vulnerability even knowing that the patch fixes one, it will probably be really hard for others to discover it.
The thread that led to this email was discussing whether, when, and how to disclose vulnerabilities to altcoins and other implementations of Bitcoin. There is no clear answer here. "Helping the good guys" seems like the sensible thing to do, but who decides who they are and where does one draw the line? Bryan Bishop argued that helping altcoins and even scamcoins defend themselves against security exploits was a moral duty:
It's not enough to defend bitcoin and its users from active threats, there is a more general responsibility to defend all kinds of users and different software from many kinds of threats in whatever forms, even if folks are using stupid and insecure software that you personally don't maintain or contribute to or advocate for. Handling knowledge of a vulnerability is a delicate matter and you might be receiving knowledge with more serious direct or indirect impact than originally described.
Also leading up to Towns' email above was a post by Gregory Maxwell, in which he argued that security vulnerabilities could be more severe than they appear:
I've multiple time seen a hard to exploit issue turn out to be trivial when you find the right trick, or a minor dos issue turn our to far more serious. Simple performance bugs, expertly deployed, can potentially be used to carve up the network--- miner A and exchange B go in one partition, everyone else in another.. and doublespend. And so on. So while I absolutely do agree that different things should and can be handled differently, it is not always so clear cut. It's prudent to treat things as more severe than you know them to be.
So, even if a vulnerability seems hard to exploit, it might be best to assume that it's easily exploitable and you just haven't figured out how yet.
He also mentions how "it's somewhat incorrect to call this thread anything about disclosure, this thread is not about disclosure. Disclosure is when you tell the vendor. This thread is about publication and that has very different implications. Publication is when you're sure you've told the prospective attackers". This last observation concerning the distinction between disclosure and publication is an important one. The easy part is responsible disclosure; the hard part is sensible publishing.

Bitcoin's Traumatic childhood

Bitcoin started out as a one-man project (at least that's what its creator's pseudonym suggests), and bitcoin initially had little to no value. As such, vulnerabilities and bug fixes were not as rigorously handled as they are today.
The Bitcoin wiki has a list of common vulnerabilities and exposures (CVEs) that Bitcoin has gone through. This section constitutes a little exposé of some of the security issues and incidents from the early years of Bitcoin. We won't cover them all, but we selected a few that we find especially interesting.

2010-07-28: Spend anyone's coins (CVE-2010-5141)

On July 28, 2010, a pseudonymous person by the name ArtForz discovered a bug in version 0.3.4 that would let anyone take coins from anyone else. ArtForz responsibly reported this to Satoshi Nakamoto and to another Bitcoin developer named Gavin Andresen.
The problem was that the script operator OP_RETURN would simply exit the program execution, so if the scriptPubKey was <pubkey> OP_CHECKSIG and scriptSig was OP_1 OP_RETURN, the part of the program in the scriptPubKey would never execute. The only thing that would happen would be for 1 to be put on the stack and then OP_RETURN would cause the program to exit. Any non-zero value on top of the stack after the program has executed means that the spending condition is fulfilled. Since the top stack element 1 is non-zero, the spending would be OK.
This was the code for handling of OP_RETURN:
case OP_RETURN: { pc = pend; } break;
The effect of pc = pend; was for the rest of the program to get skipped, meaning that any locking script in scriptPubKey would be ignored. The fix consisted of changing the meaning of OP_RETURN so that it immediately fails instead:
case OP_RETURN: { return false; } break;
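The difference between the two versions can be illustrated with a toy interpreter. This is a minimal sketch, not Bitcoin's actual script engine: it assumes that scriptSig and scriptPubKey are run as one concatenated program, and the opcode subset, the `OP_CHECKSIG_FAIL` stand-in, and the `legacy_return` switch are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcodes: a hypothetical subset, for illustration only.
enum class Op { OP_1, OP_RETURN, OP_CHECKSIG_FAIL };

// Runs scriptSig followed by scriptPubKey as a single program (an assumption
// of this sketch). `legacy_return` selects the pre-fix behaviour (skip to the
// end of the program) or the post-fix behaviour (fail immediately).
bool eval_script(const std::vector<Op>& script_sig,
                 const std::vector<Op>& script_pubkey,
                 bool legacy_return) {
    std::vector<Op> program = script_sig;
    program.insert(program.end(), script_pubkey.begin(), script_pubkey.end());

    std::vector<int> stack;
    std::size_t pc = 0;
    const std::size_t pend = program.size();
    while (pc < pend) {
        const Op op = program[pc++];
        switch (op) {
        case Op::OP_1:
            stack.push_back(1);
            break;
        case Op::OP_RETURN:
            if (!legacy_return) return false;  // fixed: the script fails
            pc = pend;                         // bug: the rest is skipped
            break;
        case Op::OP_CHECKSIG_FAIL:
            // Stand-in for a signature check the attacker cannot satisfy.
            stack.push_back(0);
            break;
        }
    }
    // Spending succeeds if the top stack element is non-zero.
    return !stack.empty() && stack.back() != 0;
}

// Usage example: the attack script passes before the fix and fails after it.
void op_return_example() {
    const std::vector<Op> script_sig{Op::OP_1, Op::OP_RETURN};
    const std::vector<Op> script_pubkey{Op::OP_CHECKSIG_FAIL};
    assert(eval_script(script_sig, script_pubkey, /*legacy_return=*/true));
    assert(!eval_script(script_sig, script_pubkey, /*legacy_return=*/false));
}
```

With the legacy behaviour, the signature check in the locking script is never reached, so the `1` pushed by the attacker decides the outcome.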
Satoshi made this change locally and built an executable binary with version 0.3.5 from it. Then he posted on the Bitcointalk forum under the title "*** ALERT *** Upgrade to 0.3.5 ASAP", urging users to install his binary, without presenting the source code for it:
Please upgrade to 0.3.5 ASAP! We fixed an implementation bug where it was possible that bogus transactions could be accepted. Do not accept Bitcoin transactions as payment until you upgrade to version 0.3.5!
The original message was later edited and is no longer available in its full form. The above snippet is taken from a reply that quoted it. Some users tried Satoshi's binary, but ran into issues with it. Shortly after, Satoshi wrote:
Haven't had time to update the SVN yet. Wait for 0.3.6, I'm building it now. You can shut down your node in the meantime.
And 35 minutes later, he wrote:
SVN is updated with version 0.3.6. Uploading Windows build of 0.3.6 to Sourceforge now, then will rebuild linux.
At this point he also seemed to have updated the original post to mention 0.3.6 instead of 0.3.5:
Please upgrade to 0.3.6 ASAP! We fixed an implementation bug where it was possible that bogus transactions could be displayed as accepted. Do not accept Bitcoin transactions as payment until you upgrade to version 0.3.6! If you can't upgrade to 0.3.6 right away, it's best to shut down your Bitcoin node until you do.
Also in 0.3.6, faster hashing:
  • midstate cache optimisation thanks to tcatm
  • Crypto++ ASM SHA-256 thanks to BlackEye
Total generating speedup 2.4x faster.
Download: http://sourceforge.net/projects/bitcoin/files/Bitcoin/bitcoin-0.3.6/
Windows and Linux users: if you got 0.3.5 you still need to upgrade to 0.3.6.
Note the difference in the characterization of the problem from the first message: "could be displayed as accepted" vs "could be accepted". Maybe Satoshi downplayed the severity of the bug in his communication so as not to draw too much attention to the actual issue. Anyhow, people upgraded to 0.3.6 and it worked as expected. This particular issue was resolved, amazingly, with no bitcoin losses.
Satoshi's message also described some performance optimizations for mining. It's unclear why they were included in a critical security fix; it's possible that the purpose was to obfuscate the real issue. However, it seems more likely that he just released whatever was on the head of the development branch of the Subversion repository, with the security fix added to it.
At that time, there weren't nearly as many users as there are today, and bitcoin's value was close to zero. If this bug response were to play out today, it would be considered a complete shit-show for multiple reasons:
  • Satoshi made a binary-only release of 0.3.5 containing the fix. No patch or code was provided, maybe as a measure to obfuscate the issue.
  • 0.3.5 didn't even work.
  • The fix in 0.3.6 was actually a hard fork.
Another debatable thing is whether it's good or bad that users were asked to shut down their nodes. This wouldn't be doable today, but at that time lots of users were actively following the forums for updates and were usually on top of things. Given that it was possible to do this, it might have been a sensible thing to do.

2010-08-15: Combined output overflow (CVE-2010-5139)

In mid-August 2010, Bitcointalk forum user jgarzik, a.k.a. Jeff Garzik, discovered that a certain transaction at block height 74638 had two outputs of unusually high value:
"out" : [
    {
        "value" : 92233720368.54277039,
        "scriptPubKey" : "OP_DUP OP_HASH160 0xB7A73EB128D7EA3D388DB12418302A1CBAD5E890 OP_EQUALVERIFY OP_CHECKSIG"
    },
    {
        "value" : 92233720368.54277039,
        "scriptPubKey" : "OP_DUP OP_HASH160 0x151275508C66F89DEC2C5F43B6F9CBE0B5C4722C OP_EQUALVERIFY OP_CHECKSIG"
    }
]
The "value out" in this block #74638 is quite strange: 92233720368.54277039 BTC? Is that UINT64_MAX, I wonder?
Presumably, a bug caused the sum of the two int64 (not uint64, as Garzik supposed) outputs to overflow to a negative value, -0.00997538 BTC. Whatever the sum of the inputs, the "sum" of the outputs would be smaller, making this transaction OK according to the code at the time.
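The arithmetic behind that negative sum can be checked with a short sketch. It assumes amounts are stored as signed 64-bit satoshi counts (100,000,000 satoshis per BTC) and that the addition wraps around in two's complement; `wrapping_add` and `outputs_covered` are hypothetical helpers written for this example, not functions from the Bitcoin code base.

```cpp
#include <cassert>
#include <cstdint>

// One output of the transaction in block 74638, in satoshis
// (92233720368.54277039 BTC).
constexpr std::int64_t kOutputValue = 9223372036854277039LL;

// Adds two amounts with 64-bit two's-complement wraparound. Signed overflow
// is undefined behaviour in C++, so the addition is done on unsigned values
// and the result is converted back (well defined since C++20, and the
// behaviour of mainstream compilers before that).
constexpr std::int64_t wrapping_add(std::int64_t a, std::int64_t b) {
    return static_cast<std::int64_t>(static_cast<std::uint64_t>(a) +
                                     static_cast<std::uint64_t>(b));
}

// Simplified stand-in for a check that outputs do not exceed inputs.
constexpr bool outputs_covered(std::int64_t inputs_sum,
                               std::int64_t outputs_sum) {
    return outputs_sum <= inputs_sum;
}

// Usage example: the two huge outputs "sum" to -997538 satoshis,
// i.e. -0.00997538 BTC, which any non-negative input sum covers.
void overflow_example() {
    const std::int64_t outputs_sum = wrapping_add(kOutputValue, kOutputValue);
    assert(outputs_sum == -997538);
    assert(outputs_covered(/*inputs_sum=*/0, outputs_sum));
}
```

The two values add up to 18446744073708554078, which is 997538 short of 2^64, hence the wrapped result of -997538 satoshis.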
In this case, the bug had been disclosed and published through an actual exploit. An unfortunate outcome of this was that about 2 × 92 billion (roughly 184 billion) bitcoin had been created, which severely diluted the money supply of around 3.7 million coins that existed at that time.
In a related thread, Satoshi posted that he'd appreciate it if people stopped mining (or generating, as they called it back then):
It would help if people stop generating. We will probably need to re-do a branch around the current one, and the less you generate the faster that will be. A first patch will be in SVN rev 132. It's not uploaded yet. I'm pushing some other misc changes out of the way first, then I'll upload the patch for this.
His plan was to make a soft fork to make transactions like the one discussed here invalid, thus invalidating the blocks (especially block 74638) that contained such transactions. Less than an hour later, he committed a patch in revision 132 of the Subversion repository and posted to the forum describing what he thought users should do:
Patch is uploaded to SVN rev 132! For now, recommended steps:
  1. Shut down.
  2. Download knightmb's blk files. (replace your blk0001.dat and blkindex.dat files)
  3. Upgrade.
  4. It should start out with less than 74000 blocks. Let it redownload the rest.
If you don't want to use knightmb's files, you could just delete your blk*.dat files, but it's going to be a lot of load on the network if everyone is downloading the whole block index at once. I'll build releases shortly.
He wanted people to download block data from a specific user, namely knightmb, who had published his blockchain as it appeared on his disk, the files blkXXXX.dat and blkindex.dat. The reason for downloading the blockchain data this way, as opposed to synchronizing from scratch, was to reduce network bandwidth bottlenecks.
There was a big caveat with this: the data users would download from knightmb weren't verified by the Bitcoin software at startup. The blkindex.dat file contained the UTXO set, and the software would accept any data therein as if it had already verified it. knightmb could have manipulated the data to give himself or anyone else some bitcoins.
Again, people seemed to go along with this, and the reversal of the invalid block and its successors was successful. Miners started working on a new successor to block 74637 and, according to the block's timestamp, a successor appeared at 23:53 UTC, about 6 hours after the issue was discovered. At 08:10 the following day, on August 16, around block 74689, the new chain had overtaken the old chain, so all non-upgraded nodes reorged to follow the new chain. This is the deepest reorg - 52 blocks - in Bitcoin's history.
Compared to the OP_RETURN issue, this issue was handled in a somewhat cleaner way:
  • No binary-only patch release
  • The released software worked as intended
  • No hard fork
Users were asked to stop mining during this issue as well. We can discuss whether this is a good idea or not, but imagine you're a miner and you're convinced that any blocks on top of the bad block will eventually get wiped out in a deep reorg: why would you waste resources on mining doomed blocks?
You might also think that it's a bit fishy to do as suggested by Nakamoto and download the blockchain, including the UTXO set, from a random dude's hard drive. If so, you're right: that is fishy. But, given the circumstances, this emergency response was a sensible one.
There's an important difference between this case and the previous OP_RETURN case: this issue was exploited in the wild, so the fix could be handled more straightforwardly. In the case of OP_RETURN, they had to obfuscate the fix and make public statements that didn't directly reveal what the issue was.

2013-03-11: DB locks issue 0.7.2 - 0.8.0 (CVE-2013-3220)

A very interesting and educationally valuable issue surfaced in March 2013. It appeared that the blockchain had split (although the word "fork" is used in the quote below) after block 225429. The details of this incident were reported in BIP50. The summary says:
A block that had a larger number of total transaction inputs than previously seen was mined and broadcasted. Bitcoin 0.8 nodes were able to handle this, but some pre-0.8 Bitcoin nodes rejected it, causing an unexpected fork of the blockchain. The pre-0.8-incompatible chain (from here on, the 0.8 chain) at that point had around 60% of the mining hash power ensuring the split did not automatically resolve (as would have occurred if the pre-0.8 chain outpaced the 0.8 chain in total work, forcing 0.8 nodes to reorganise to the pre-0.8 chain). In order to restore a canonical chain as soon as possible, BTCGuild and Slush downgraded their Bitcoin 0.8 nodes to 0.7 so their pools would also reject the larger block. This placed majority hashpower on the chain without the larger block, thus eventually causing the 0.8 nodes to reorganise to the pre-0.8 chain.
The quick action that the mining pools BTCGuild and Slush took was crucial in this emergency. They were able to tip the majority of the hash power over to the pre-0.8 branch of the split, and thus help restore consensus. This gave developers the time to figure out a sustainable fix.
What's also very interesting in this issue is that version 0.7.2 was incompatible with itself, as was the case with prior versions too. This is explained in the Root cause section of BIP50:
With the insufficiently high BDB lock configuration, it implicitly had become a network consensus rule determining block validity (albeit an inconsistent and unsafe rule, since the lock usage could vary from node to node).
In short, the issue is that the number of database locks the Bitcoin Core software needs to verify a block is not deterministic. One node might need X locks while another node might need X+1 locks. Each node also has a limit on how many locks the software can take. If the number of locks needed exceeds the limit, the block will be considered invalid. So if X+1 exceeds the limit but X does not, then the two nodes will split the blockchain and disagree on which branch is valid.
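The X versus X+1 scenario can be modelled in a few lines. This is a hypothetical sketch: the `Node` struct, its fields, and the numbers used below are made up for illustration and do not reflect Bitcoin Core's database code.

```cpp
#include <cassert>

// Hypothetical model of a node validating one specific block.
struct Node {
    unsigned lock_limit;    // configured maximum number of database locks
    unsigned locks_needed;  // locks this node happens to need for the block
};

// The lock limit acts as an implicit validity rule: a block that needs
// more locks than the limit is treated as invalid by that node.
bool accepts_block(const Node& n) { return n.locks_needed <= n.lock_limit; }

// Two nodes split when exactly one of them accepts the block.
bool nodes_split(const Node& a, const Node& b) {
    return accepts_block(a) != accepts_block(b);
}

// Usage example with made-up numbers: same software, same limit, but one
// node needs X locks and the other X+1.
void db_locks_example() {
    const Node a{/*lock_limit=*/10000, /*locks_needed=*/10000};  // X
    const Node b{/*lock_limit=*/10000, /*locks_needed=*/10001};  // X+1
    assert(accepts_block(a));
    assert(!accepts_block(b));
    assert(nodes_split(a, b));
}
```

Because `locks_needed` depends on each node's local state rather than on the block alone, two nodes running identical software can reach different verdicts on the same block.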
The solution chosen, apart from the immediate actions taken by the two pools to restore consensus, was to
  • limit the blocks in terms of both size and locks needed on version 0.8.1
  • patch old versions (0.7.2 and some older ones) with the same new rules, and increase the global lock limit.
Except for the increased global lock limit in the second bullet, these rules were implemented temporarily for a pre-determined amount of time. The plan was to remove these limits once most nodes had upgraded.
This soft fork dramatically reduced the risk of consensus failure, and a few months later, on May 15, the temporary rules were deactivated in concert across the network. Note that this deactivation was in effect a hard fork, but it was not contentious. Furthermore, it was released along with the preceding soft fork, so people running the soft-forked software were well aware that a hard fork would follow it. Therefore, the vast majority of nodes remained in consensus when the hard fork got activated. Unfortunately, though, a few nodes that didn't upgrade were lost in the process.
One might wonder if this would be doable today. The mining landscape is more complex today, and, depending on the hash power on each side of the split, it might be hard to roll out a patch such as the one in BIP50 quickly enough. It'd probably be hard to convince miners on the "wrong" branch to let go of their block rewards.

BIP66

BIP66 is interesting because it highlights the importance of:
  • good selection cryptography
  • responsible disclosure
  • deployment without revealing the vulnerability
  • mining on top of verified blocks
BIP66 was a proposal to tighten up the rules for signature encodings in Bitcoin Script. The motivation was to be able to parse signatures with software or libraries other than OpenSSL, and even with more recent versions of OpenSSL. OpenSSL is a general-purpose cryptography library that Bitcoin Core used at that time.
The BIP activated on July 4, 2015. However, while the above is true, BIP66 also fixes a much more severe issue not mentioned in the BIP.
The vulnerability
The full disclosure of this issue was published on July 28, 2015 by Pieter Wuille in an email to the Bitcoin-dev mailing list:
Hello all, I'd like to disclose a vulnerability I discovered in September 2014, which became unexploitable when BIP66's 95% threshold was reached earlier this month. Short description: A specially-crafted transaction could have forked the blockchain between nodes:
  • using OpenSSL on a 32-bit systems and on 64-bit Windows systems
  • using OpenSSL on non-Windows 64-bit systems (Linux, OSX, ...)
  • using some non-OpenSSL codebases for parsing signatures
The email further lays out the details of how the issue was discovered and what exactly caused it. At the end, he gives a timeline of the events, and we will replay some of the most important ones here. Some of them have already been described.
[Figure: timeline of the events surrounding BIP66. Items in black have been explained above.]
Before discovery
Without anyone knowing about the issue, it could have been resolved by the now withdrawn BIP62, which was a proposal to reduce the possibilities of transaction malleability. Among the proposed changes in BIP62 was a tightening of the consensus rules for the encoding of signatures, or "strict DER encoding". Pieter Wuille proposed some tweaks to the BIP in July 2014 that would have solved the issue:
2014-Jul-18: In order to make Bitcoin's signature encoding rules not depend on OpenSSL's specific parser, I modified the BIP62 proposal to have its strict DER signatures requirement also apply to version 1 transactions. No non-DER signatures were being mined into blocks anymore at the time, so this was assumed to not have any impact. See https://github.com/bitcoin/bips/pull/90 and http://lists.linuxfoundation.org/pipermail/bitcoin-dev/2014-July/006299.html. Unknown at the time, but if deployed this would have solved the vulnerability.
Due to the breadth of this BIP, which covered substantially more than just "strict DER encoding", it was constantly changing and never got near deployment. The BIP was later withdrawn because Segregated Witness, BIP141, solved transaction malleability in a different and more complete way.
After discovery
OpenSSL released new versions of their software with patches that, if used in Bitcoin since the beginning, would have solved the issue. However, using a new version of OpenSSL only in new releases of Bitcoin Core would make matters worse, because old and new nodes could then disagree on whether a signature is valid. Gregory Maxwell explains this in another email thread in January 2015:
While for most applications it is generally acceptable to eagerly reject some signatures, Bitcoin is a consensus system where all participants must generally agree on the exact validity or invalidity of the input data. In a sense, consistency is more important than "correctness". [...] The patches above, however, only fix one symptom of the general problem: relying on software not designed or distributed for consensus use (in particular OpenSSL) for consensus-normative behavior. Therefore, as an incremental improvement, I propose a targeted soft-fork to enforce strict DER compliance soon, utilizing a subset of BIP62.
He points out that using code that's not intended for use in consensus systems poses serious risks, and proposes that Bitcoin implement strict DER encoding. This is a very clear example of the importance of good selection cryptography.
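To give an idea of what "strict DER encoding" means in practice, here is a minimal sketch of such a check, following the structure of the check described in BIP66. The function name `is_strict_der` and the layout of the code are illustrative, and the signature is assumed to carry a trailing sighash byte; consult the BIP itself for the normative rules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expected layout:
// 0x30 [total-len] 0x02 [R-len] [R] 0x02 [S-len] [S] [sighash]
bool is_strict_der(const std::vector<unsigned char>& sig) {
    if (sig.size() < 9 || sig.size() > 73) return false;  // size bounds
    if (sig[0] != 0x30) return false;                     // compound marker
    if (sig[1] != sig.size() - 3) return false;           // declared length

    const std::size_t len_r = sig[3];
    if (5 + len_r >= sig.size()) return false;            // S-len in range
    const std::size_t len_s = sig[5 + len_r];
    if (len_r + len_s + 7 != sig.size()) return false;    // lengths add up

    if (sig[2] != 0x02) return false;                     // R is an integer
    if (len_r == 0) return false;                         // R is not empty
    if (sig[4] & 0x80) return false;                      // R is not negative
    // No unnecessary leading zero byte in R.
    if (len_r > 1 && sig[4] == 0x00 && !(sig[5] & 0x80)) return false;

    if (sig[len_r + 4] != 0x02) return false;             // S is an integer
    if (len_s == 0) return false;                         // S is not empty
    if (sig[len_r + 6] & 0x80) return false;              // S is not negative
    // No unnecessary leading zero byte in S.
    if (len_s > 1 && sig[len_r + 6] == 0x00 && !(sig[len_r + 7] & 0x80))
        return false;
    return true;
}

// Usage example: a minimal well-formed encoding passes, while the same
// value with a superfluous leading zero in R is rejected.
void strict_der_example() {
    assert(is_strict_der({0x30, 0x06, 0x02, 0x01, 0x01,
                          0x02, 0x01, 0x01, 0x01}));
    assert(!is_strict_der({0x30, 0x07, 0x02, 0x02, 0x00, 0x01,
                           0x02, 0x01, 0x01, 0x01}));
}
```

The point of such a check is that every byte is pinned down by explicit rules, so validity no longer depends on how lenient a particular parsing library happens to be.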
These events might give you the impression that Gregory Maxwell knew about the vulnerability Pieter Wuille later published, but wanted to help sneak in a fix disguised as a precautionary measure, without drawing too much attention to the actual problem. It might be so, but that is pure speculation.
Then, as proposed by Maxwell, BIP66 was created as a subset of BIP62 that specified only strict DER encoding. This BIP was apparently broadly accepted and was deployed in July 2015, although, ironically, two blockchain splits occurred due to validationless mining. These splits are discussed in the next section.
A key takeaway from this is that BIPs should be more or less atomic, meaning that they should be complete enough to provide something useful or solve a specific problem, but small enough to allow for broad support among users. The more stuff you put into a BIP, the smaller the chance of acceptance.
Splits due to validationless mining
Unfortunately, the story of BIP66 didn't end there. When BIP66 was activated, it turned out quite messy because some miners didn't verify the blocks they were trying to extend. This is called validationless mining, or SPV-mining (as in Simplified Payment Verification). An alert message was sent out to Bitcoin nodes with a link to a web page describing the issue:
Early morning on 4 July 2015, the 950/1000 (95%) threshold was reached. Shortly thereafter, a small miner (part of the non-upgraded 5%) mined an invalid block–as was an expected occurrence. Unfortunately, it turned out that roughly half the network hash rate was mining without fully validating blocks (called SPV mining), and built new blocks on top of that invalid block.
The alert page instructed people using older versions of Bitcoin Core to wait for 30 more confirmations than they normally would.
The split mentioned above occurred on 2015-07-04 at 02:10 UTC after block height 363730. This issue got resolved at 03:50 the same day, after 6 invalid blocks had been mined. Unfortunately, the same issue happened again the next day, i.e. on 2015-07-05 at 21:50, but this time the invalid branch only lasted 3 blocks.
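The 950/1000 threshold quoted in the alert can be pictured as a count over a sliding window of recent blocks. The sketch below is a simplification under stated assumptions: how a block signals support is reduced to a boolean, and `threshold_reached` is a hypothetical helper, not a function from Bitcoin Core.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when at least `required` of the most recent `window` blocks
// signal the upgrade. `signals_upgrade[i]` says whether block i signals
// support, oldest first (an abstraction made for this sketch).
bool threshold_reached(const std::vector<bool>& signals_upgrade,
                       std::size_t window = 1000,
                       std::size_t required = 950) {
    if (signals_upgrade.size() < window) return false;
    std::size_t count = 0;
    for (std::size_t i = signals_upgrade.size() - window;
         i < signals_upgrade.size(); ++i) {
        if (signals_upgrade[i]) ++count;
    }
    return count >= required;
}

// Usage example: 950 signalling blocks out of 1000 meet the threshold,
// 949 do not.
void threshold_example() {
    std::vector<bool> blocks(1000, true);
    for (std::size_t i = 0; i < 50; ++i) blocks[i] = false;
    assert(threshold_reached(blocks));   // 950 of 1000
    blocks[50] = false;
    assert(!threshold_reached(blocks));  // 949 of 1000
}
```

Note that reaching the threshold only says that miners signalled the upgrade. As the splits show, it says nothing about whether they actually validate the blocks they build on.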
The events that led up to BIP66, its deployment, and the aftermath are a very good case study for how careful Bitcoin developers have to be. A few key takeaways from BIP66:
  • The balance between openness and not publishing a vulnerability is a delicate one.
  • Deploying fixes for non-published vulnerabilities is a tricky game to play.
  • Retaining consensus is hard.
  • Software not intended for consensus systems is generally risky.
  • BIPs should be somewhat atomic.

Conclusion about When Shit Hits The Fan

Bitcoin has bugs. People discovering bugs are encouraged to disclose them responsibly to Bitcoin developers, so they can fix the bug without revealing it publicly. Ideally, the bug fix can be disguised as a performance improvement, or some other smoke screen.
We've looked at some of the more severe issues that have surfaced through the years, and how they were handled. Some were discovered publicly through exploits, while others were responsibly disclosed and could be fixed before malicious actors had a chance to exploit them.