Progress pill
Bitcoin Central Values

Privacy

  • What does privacy mean?
  • Why is privacy important?
  • Pseudonymity
  • Blockchain privacy
  • Non-blockchain privacy
  • Fungibility
  • Privacy measures
  • Conclusion about Privacy
This chapter deals with how to keep your private financial information to yourself. It explains what privacy stands for in the context of Bitcoin, why it's important, and what it means to say that Bitcoin is pseudonymous. It also looks into how private data can leak, both on-chain and off-chain.
Then, it talks about the fact that bitcoins should be fungible, meaning interchangeable for any other bitcoins, and how fungibility and privacy go hand in hand. Lastly, the chapter introduces some measures you can take to improve your privacy and that of others.
Bitcoin can be described as a pseudonymous system, where users have multiple pseudonyms in the form of public keys. At first glance, this looks like a pretty good way to protect users from being identified, but it is in fact really easy to leak private financial information unintentionally.

What does privacy mean?

Privacy can mean different things in different contexts. In Bitcoin, it generally means that users don't have to reveal their financial information to others, unless they voluntarily do so.
There are many ways in which you may leak your private information to others, with or without knowing it. Data can either leak from the public blockchain or through other means, for example when malicious actors intercept your internet communications.

Why is privacy important?

It may seem obvious why privacy is important in Bitcoin, but there are some aspects of it that one might not immediately think about. On the Bitcoin Talk forum, Gregory Maxwell walks us through a lot of good reasons why he thinks privacy matters. Among them are free market, safety, and human dignity:
Financial privacy is an essential criteria for the efficient operation of a free market: if you run a business, you cannot effectively set prices if your suppliers and customers can see all your transactions against your will. You cannot compete effectively if your competition is tracking your sales. Individually your informational leverage is lost in your private dealings if you don't have privacy over your accounts: if you pay your landlord in Bitcoin without enough privacy in place, your landlord will see when you've received a pay raise and can hit you up for more rent. Financial privacy is essential for personal safety: if thieves can see your spending, income, and holdings, they can use that information to target and exploit you. Without privacy malicious parties have more ability to steal your identity, snatch your large purchases off your doorstep, or impersonate businesses you transact with towards you... they can tell exactly how much to try to scam you for. Financial privacy is essential for human dignity: no one wants the snotty barista at the coffee shop or their nosy neighbors commenting on their income or spending habits. No one wants their baby-crazy in-laws asking why they're buying contraception (or sex toys). Your employer has no business knowing what church you donate to. Only in a perfectly enlightened discrimination free world where no one has undue authority over anyone else could we retain our dignity and make our lawful transactions freely without self-censorship if we don't have privacy.
Maxwell also touches on fungibility, which will be discussed later in this chapter, as well as on how privacy and law enforcement are not contradictory.

Pseudonymity

We mentioned above that Bitcoin is pseudonymous, and that the pseudonyms are public keys. In the media you often hear that Bitcoin is anonymous, which is not correct. There is a distinction between anonymity and pseudonymity.
Andrew Poelstra explains in a Bitcoin Stack Exchange post what anonymity would look like in transactions:
Total anonymity, in the sense that when you spend money there is no trace of where it came from or where it's going, is theoretically possible by using the cryptographic technique of zero-knowledge proofs.
The difference seems to be that in a pseudonymous form of money you can trace payments between pseudonyms, whereas in an anonymous form of money you can't. Since bitcoin payments are traceable between pseudonyms, it's not an anonymous system.
We have also said that the pseudonyms are public keys, but it's actually addresses derived from public keys. Why do we use addresses as pseudonyms and not something else, for example some descriptive names, like "watchme1984"? This has been explained well by user Tim S., also on Bitcoin Stack Exchange:
In order for Bitcoin's idea to work, you must have coins that can only be spent by the owner of a given private key. This means that whatever you send to must be tied, in some way, to a public key. Using arbitrary pseudonyms (e.g. user names) would mean that you'd have to then somehow link the pseudonym to a public key in order to enable public/private key crypto. This would remove the ability to securely create addresses/pseudonyms offline (e.g. before someone could send money to the user name "tdumidu", you'd have to announce in the blockchain that "tdumidu" is owned by public key "a1c...", and include a fee so others have a reason to announce it), reduce anonymity (by encouraging you to reuse pseudonyms), and needlessly bloat the size of the blockchain. It would also create a false sense of security that you're sending to who you think you are (if I take the name "Linus Torvalds" before he does, then it's mine and people might send money thinking they're paying the creator of Linux, not me).
By using addresses, or public keys, we achieve important goals, such as removing the need to somehow register a pseudonym beforehand, reducing the incentives for pseudonym reuse, avoiding blockchain bloat, and making it harder to impersonate other people.

Blockchain privacy

Blockchain privacy refers to the information you disclose by transacting on the blockchain. It applies to all transactions, the ones you send as well as the ones you receive.
Satoshi Nakamoto ponders over on-chain privacy in section 7 of his Bitcoin whitepaper:
As an additional firewall, a new key pair should be used for each transaction to keep them from being linked to a common owner. Some linking is still unavoidable with multi-input transactions, which necessarily reveal that their inputs were owned by the same owner. The risk is that if the owner of a key is revealed, linking could reveal other transactions that belonged to the same owner.
The paper summarizes the main problems of blockchain privacy, namely address reuse and address clustering. The first is self-explaining, the latter refers to the ability to decide, with some level of certainty, that a set of different addresses belongs to the same user.
Chris Belcher wrote in great detail about the different kinds of privacy leaks that can happen on the Bitcoin blockchain. We recommend you read at least the first few subsections under "Blockchain attacks on privacy."
The takeaway is that privacy in Bitcoin isn't perfect. It requires a significant amount of work to transact privately. Most people aren't prepared to go that far for privacy. There seems to be a clear trade-off between privacy and usability.
Another important aspect of privacy is that the measures you take to protect your own privacy affect other users as well. If you are sloppy with your own privacy, other people might experience reduced privacy, too. Gregory Maxwell explains this very plainly on the same Bitcoin Talk discussion that we linked above, and concludes with an example:
This actually works in practice, too... A nice whitehat hacker on IRC was playing around with brainwallet cracking and hit a phrase with ~250 BTC in it. We were able to identify the owner from just the address alone, because they'd been paid by a Bitcoin service that reused addresses and he was able to talk them into giving up the users contact information. He actually got the user on the phone, they were shocked and confused— but grateful to not be out their coin. A happy ending there. (This isn't the only example of it, by far ... but its one of the more fun ones).
In this case, it all went well thanks to the philanthropically-minded hacker, but don't count on that next time.

Non-blockchain privacy

While the blockchain proves to be a notorious source of privacy leaks, there are plenty of other leaks that don't use the blockchain, some sneakier than others. These range from key-loggers to network traffic analysis. To read up on some of these methods, please refer again to Chris Belcher's piece, specifically the section "Non-blockchain attacks on privacy".
Among a plethora of attacks, Belcher mentions the possibility of someone snooping on your internet connection, for example, your ISP:
If the adversary sees a transaction or block coming out of your node which did not previously enter, then it can know with near-certainty that the transaction was made by you or the block was mined by you. As internet connections are involved, the adversary will be able to link the IP address with the discovered bitcoin information.
However, among the most obvious privacy leaks are exchanges. Due to laws, usually referred to as KYC (Know Your Customer) and AML (Anti-Money Laundering), that are valid in the jurisdictions they operate in, exchanges and related companies often have to collect personal data about their users, building up big databases about which users own which bitcoins. These databases are great honeypots for evil governments and criminals who are always on the lookout for new victims. There are actual markets for this kind of data, where hackers sell data to the highest bidder.
To make things worse, the companies that manage these databases often have little experience with protecting financial data, in fact many of them are young start-ups, and we know for a fact that several leaks have already occurred. A few examples are India-based MobiQwik and HubSpot.
Again, protecting data against this wide range of attacks is hard, and it is likely that you won't be fully able to do so. You'll have to opt for the trade-off between convenience and privacy that works best for you.

Fungibility

Fungibility, in the context of currencies, means that one coin is interchangeable for any other coin of the same currency. This funny word was briefly touched upon earlier in the chapter.
In the article discussed there, Gregory Maxwell stated:
Financial privacy is an essential element to fungibility in Bitcoin: if you can meaningfully distinguish one coin from another, then their fungibility is weak. If our fungibility is too weak in practice, then we cannot be decentralized: if someone important announces a list of stolen coins they won't accept coins derived from, you must carefully check coins you accept against that list and return the ones that fail. Everyone gets stuck checking blacklists issued by various authorities because in that world we'd all not like to get stuck with bad coins. This adds friction and transactional costs and makes Bitcoin less valuable as a money.
Here, he speaks about the dangers derived from a lack of fungibility. Suppose that you have a UTXO. That UTXO's history can normally be traced back several hops, fanning out to multitudes of previous outputs. If any of those outputs were involved in any illegal, unwanted, or suspicious activity, then some potential recipients of your coin might reject it. If you think that your payees will verify your coins against some centralized whitelist or blacklist service, you might start checking the coins you receive too, just to be on the safe side. The result is that bad fungibility will bolster even worse fungibility.
Adam Back and Matt Corallo gave a presentation about fungibility at Scaling Bitcoin in Milan in 2016. They were thinking along the same lines:
You need fungibility for bitcoin to function. If you receive coins and can’t spend them, then you start to doubt whether you can spend them. If there are doubts about coins you receive, then people are going to go to taint services and check whether "are these coins blessed" and then people are going to refuse to trade. What this does is it transitions bitcoin from a decentralized permissionless system into a centralized permissioned system where you have an "IOU" from the blacklist providers.
It seems that privacy and fungibility go hand-in-hand. Fungibility will weaken if privacy is weak, for example as coins from unwanted people may become blacklisted. In the same way, privacy will weaken if fungibility is weak: if there is a blacklist, you will have to ask the blacklist providers about which coins to accept, thereby possibly revealing your IP address, email address, and other sensitive information. These two features are so intertwined that it's hard to talk about either of them in isolation.

Privacy measures

Several techniques have been developed to help people protecting themselves from privacy leaks. Among the most obvious ones is, as noted by Nakamoto earlier, using unique addresses for every transaction, but several others exist. We're not going to teach you how to become a privacy ninja. However, Bitcoin Q+A has a quick summary of privacy-enhancing technologies, somewhat ordered by how hard they are to implement. When you read it, you'll notice that Bitcoin privacy often has to do with stuff outside of Bitcoin. For example, you shouldn't brag about your bitcoins, and you should use Tor and VPN.
The post also lists some measures directly related to Bitcoin:
  • Full node: If you don't use your own full node, you will leak lots of information about your wallet to servers on the internet. Running a full node is a great first step.
  • Lightning Network: Several protocols exist on top of Bitcoin, for example the Lightning Network and Blockstream's Liquid sidechain.
  • CoinJoin: A way for multiple people to merge their transactions into one, making it harder to do chain analysis.
In a talk at the Breaking Bitcoin conference, Chris Belcher gave an interesting practical example of how privacy has been improved:
They were a bitcoin casino. Online gambling is not allowed in the US. Any customers of Coinbase that deposited straight to Bustabit would have their accounts shutdown because Coinbase was monitoring for this. Bustabit did a few things. They did something called change avoidance where you go through– and you see if you can construct a transaction that has no change output. This saves miner fees and also hinders analysis. Also, they imported their heavily-used reused deposit addresses into joinmarket. At this point, coinbase.com customers never got banned. It seems Coinbase’s surveillance service was unable to do the analysis after this, so it is possible to break these algorithms.
He also mentioned this example, among others, on the Privacy page on the Bitcoin wiki.
Note how better privacy can be achieved by building systems on top of Bitcoin, as is the case with Lightning Network:
Layers on top of Bitcoin can add privacy
We noted in the last chaper that the need for trust can only increase with layers on top, but that doesn't seem to be the case for privacy, which can be improved or made worse arbitrarily in layers on top. Why is that? Any layer on top of Bitcoin, as explained in the Layered Scaling paragraph in future chapter Scaling, must use on-chain transactions occasionally, otherwise it wouldn't be "on top of Bitcoin". Privacy-enhancing layers generally try to use the base layer as little as possible to minimize the amount of information revealed.
The above are somewhat technical ways to improve your privacy. But there are other ways. At the beginning of this chapter, we said that Bitcoin is a pseudonymous system. This means that users in Bitcoin aren't known by their real names or other personal data, but by their public keys. A public key is a pseudonym for a user, and a user can have multiple pseudonyms. In an ideal world, your in-person identity is decoupled from your Bitcoin pseudonyms. Unfortunately, due to the privacy problems described in this chapter, this decoupling usually degrades over time.
To mitigate the risks of having your personal data revealed is to not give it out in the first place nor to give it to centralized services, which build big databases that can leak. An article by Bitcoin Q+A explains KYC and the dangers derived from it. It also suggests some steps you can take to improve your situation:
Thankfully there are some options out there to purchase Bitcoin via no KYC sources. These are all P2P (peer to peer) exchanges where you are trading directly with another individual and not a centralised third party. Unfortunately some sell other coins as well as bitcoin so we urge you to take care.
The article suggests you avoid using exchanges that require KYC/AML and instead trade in private, or use decentralized exchanges like bisq.
For more in-depth reading about countermeasures, refer to the previously mentioned wiki article on privacy, starting at "Methods for improving privacy (non-blockchain)".

Conclusion about Privacy

Privacy is very important but hard to achieve. There is no privacy silver bullet.
To get decent privacy in Bitcoin, you have to take active measures, some of which are costly and time-consuming.
Quiz
Quiz1/5
What does it mean when Bitcoin is described as a pseudonymous system rather than an anonymous system?