Understanding and protecting against chain analysis

Putting it into practice with a block explorer


  • Exercise 1
  • Exercise 2
  • Exercise 3
  • Exercise 4
  • Exercise 5
  • Exercise 6
  • Exercise 7
  • Exercise 8
  • Exercise solutions
In this final chapter, we will apply the concepts we've studied so far in practice. I'm going to show you examples of real Bitcoin transactions, and you'll have to extract the information I'm asking you for.
Ideally, these exercises would be done with a professional chain analysis tool. However, since the arrest of the creators of Samourai Wallet, OXT.me, the only free analysis tool, is no longer available. We'll therefore use a classic block explorer instead. I recommend Mempool.space for its many features and range of chain analysis tools, but you can also use another explorer such as Bitcoin Explorer.
I'll start by presenting the exercises. Use your block explorer to complete them, and write down your answers on a sheet of paper. Then, at the end of this chapter, I'll provide the answers so you can check and correct your results.
The transactions used in these exercises were picked more or less at random, purely for their characteristics. This chapter is intended for educational and informative purposes only. I would like to make it clear that I neither support nor encourage the use of these tools for malicious purposes. The aim is to teach you how to protect yourself against chain analysis, not to conduct analysis to expose other people's private information.

Exercise 1

Identifier of the transaction to be analyzed:
3769d3b124e47ef4ffb5b52d11df64b0a3f0b82bb10fd6b98c0fd5111789bef7
What is the name of this transaction's model, and what plausible interpretations can be drawn by examining only its model, i.e., the structure of the transaction?

Exercise 2

Identifier of the transaction to be analyzed:
baa228f6859ca63e6b8eea24ffad7e871713749d693ebd85343859173b8d5c20
What is the name of this transaction's model, and what plausible interpretations can be drawn by examining only its model, i.e., the structure of the transaction?

Exercise 3

Identifier of the transaction to be analyzed:
3a9eb9ccc3517cc25d1860924c66109262a4b68f4ed2d847f079b084da0cd32b
What is the model for this transaction?
Having identified its model, and using the transaction's internal heuristics, which output is most likely to be the change?

Exercise 4

Identifier of the transaction to be analyzed:
35f0b31c05503ebfdf7311df47f68a048e992e5cf4c97ec34aa2833cc0122a12
What is the model for this transaction?
Having identified its model, and using the transaction's internal heuristics, which output is most likely to be the change?

Exercise 5

Let's imagine that Loïc has posted one of his Bitcoin receiving addresses on the social network Twitter:
bc1qja0hycrv7g9ww00jcqanhfpqmzx7luqalum3vu
Based on this information and using only the address reuse heuristic, which Bitcoin transactions can be linked to Loïc's identity?
Obviously, I'm not the real owner of this receiving address, and I didn't post it on social networks. It's an address I picked at random from the blockchain.

Exercise 6

Following exercise 5, thanks to the address reuse heuristic, you were able to identify several Bitcoin transactions in which Loïc seems to be involved. Normally, among the transactions identified, you should have spotted this one:
2d9575553c99578268ffba49a1b2adc3b85a29926728bd0280703a04d051eace
This transaction is the very first to send funds to Loïc's address. Where do you think the bitcoins received by Loïc via this transaction came from?

Exercise 7

Following exercise 5, thanks to the address reuse heuristic, you were able to identify several Bitcoin transactions in which Loïc seems to be involved. Now you want to find out where Loïc is located. Based on the transactions found, perform a temporal analysis to find the time zone Loïc most likely lives in. From this time zone, determine a plausible location for Loïc (country, state/region, city...).

Exercise 8

Here is the Bitcoin transaction to study:
bb346dae645d09d32ed6eca1391d2ee97c57e11b4c31ae4325bcffdec40afd4f
Looking at this transaction alone, what can we infer from it?

Exercise solutions

Exercise 1:
The model for this transaction is the simple payment model. From its structure alone, we can infer that one output is the change and the other is an actual payment. The observed user therefore probably no longer controls one of the two output UTXOs (the payment), but still controls the other (the change).
Exercise 2:
The model for this transaction is that of grouped spending. This model usually reveals large-scale economic activity, such as that of an exchange platform. We can infer that the input UTXO comes from a company with a high level of economic activity, and that the output UTXOs will be scattered among many recipients. Some will belong to customers of the company who have withdrawn their bitcoins to self-custody wallets. Others may go to partner companies. Finally, at least one output is very likely change that returns to the issuing company.
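The reasoning used in exercises 1 and 2 relies only on the number of inputs and outputs. As a rough sketch, it can be written as a small Python function. The function name and the `batch_threshold` value are my own assumptions for illustration, not a standard, and a real analysis would look at more than these two counts.

```python
def classify_model(n_inputs: int, n_outputs: int, batch_threshold: int = 5) -> str:
    """Name a transaction model from its input and output counts alone.

    batch_threshold is a hypothetical cut-off: the number of outputs from
    which we start calling a transaction "grouped spending".
    """
    if n_outputs == 2:
        # One payment and one change output.
        return "simple payment"
    if n_outputs == 1 and n_inputs > 1:
        # Several UTXOs merged into a single one.
        return "consolidation"
    if n_outputs >= batch_threshold:
        # Many recipients paid at once, typical of high economic activity.
        return "grouped spending"
    return "unclassified"


print(classify_model(1, 2))   # simple payment
print(classify_model(1, 51))  # grouped spending
print(classify_model(5, 1))   # consolidation
```

The model only suggests an interpretation. It does not prove who owns which output.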
Exercise 3:
The model for this transaction is a simple payment. We can therefore apply internal heuristics to the transaction to try to identify the change.
I have personally identified at least two internal heuristics that support the same hypothesis:
  • The reuse of the same type of script;
  • The largest output.
The most obvious heuristic is the reuse of the same type of script. Indeed, output 0 is a P2SH, recognizable by its receiving address starting with 3:
3Lcdauq6eqCWwQ3UzgNb4cu9bs88sz3mKD
While output 1 is a P2WPKH, identifiable by its address starting with bc1q:
bc1qya6sw6sta0mfr698n9jpd3j3nrkltdtwvelywa
The UTXO used as input for this transaction also uses a P2WPKH script:
bc1qyfuytw8pcvg5vx37kkgwjspg73rpt56l5mx89k
Thus, we can assume that output 0 corresponds to a payment and output 1 is the transaction's change, which would mean that the user who funded the input still owns output 1.
To support or refute this hypothesis, we can look for other heuristics that either confirm our thinking or decrease the probability that our hypothesis is correct.
I've identified at least one other heuristic: the largest output. Output 0 is worth 123,689 sats, while output 1 is worth 505,839 sats. There is, therefore, a significant difference between these two outputs. The largest output heuristic suggests that the largest output is likely to be the change. This heuristic further strengthens our initial hypothesis.
It therefore seems likely that the user who supplied the input UTXO still holds output 1, which appears to be the transaction's change.
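The two heuristics used in this solution can be sketched in a few lines of Python. This is only an illustration: the function names are mine, and the prefix check is a simplification that treats every `bc1q` address as P2WPKH, as the text above does, even though longer `bc1q` addresses can be P2WSH.

```python
def script_type(address: str) -> str:
    """Guess the script type from the address prefix (simplified)."""
    if address.startswith("bc1q"):
        return "P2WPKH"  # simplification: bc1q can also be P2WSH
    if address.startswith("3"):
        return "P2SH"
    if address.startswith("1"):
        return "P2PKH"
    return "unknown"


def change_by_script_reuse(input_addresses, output_addresses):
    """Indexes of outputs whose script type matches one of the inputs."""
    input_types = {script_type(a) for a in input_addresses}
    return [i for i, a in enumerate(output_addresses)
            if script_type(a) in input_types]


def change_by_largest_output(output_values):
    """Index of the largest output, assumed to be the change."""
    return max(range(len(output_values)), key=lambda i: output_values[i])


# Data read from the transaction of exercise 3.
input_addresses = ["bc1qyfuytw8pcvg5vx37kkgwjspg73rpt56l5mx89k"]
output_addresses = [
    "3Lcdauq6eqCWwQ3UzgNb4cu9bs88sz3mKD",
    "bc1qya6sw6sta0mfr698n9jpd3j3nrkltdtwvelywa",
]
output_values = [123_689, 505_839]  # in sats

print(change_by_script_reuse(input_addresses, output_addresses))  # [1]
print(change_by_largest_output(output_values))                    # 1
```

Both functions point to output 1, which matches the hypothesis above. Neither is a proof: they only raise or lower the probability of a hypothesis.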
Exercise 4:
The model for this transaction is a simple payment. We can therefore apply internal heuristics to the transaction to try to identify the change.
I have personally identified at least two internal heuristics that support the same hypothesis:
  • The reuse of the same type of script;
  • The round amount output.
The most obvious heuristic is the reuse of the same type of script. Indeed, output 0 is a P2SH, recognizable by its receiving address starting with 3:
3FSH5Mnq6S5FyQoKR9Yjakk3X4KCGxeaD4
While output 1 is a P2WPKH, identifiable by its address starting with bc1q:
bc1qvdywdcfsyavt4v8uxmmrdt6meu4vgeg439n7sg
The UTXO used as input for this transaction also uses a P2WPKH script:
bc1qku3f2y294h3ks5eusv63dslcua2xnlzxx0k6kp
Thus, we can assume that output 0 corresponds to a payment and output 1 is the transaction's change, which would mean that the user who funded the input still owns output 1.
To support or refute this hypothesis, we can look for other heuristics that either confirm our thinking or decrease the probability that our hypothesis is correct.
I've identified at least one other heuristic: the round amount output. Output 0 is worth 70,000 sats, while output 1 is worth 22,962 sats. Output 0 is therefore a perfectly round amount (0.0007 BTC). The round amount heuristic suggests that the UTXO with a round amount is most likely the payment and that, by elimination, the other represents the change. This heuristic further strengthens our initial hypothesis.
However, in this example, another heuristic could challenge our initial hypothesis. Indeed, output 0 is larger than output 1. Based on the heuristic that the largest output generally represents the change, we would conclude that output 0 is the change. However, this counter-hypothesis seems implausible, as the other two heuristics appear substantially more convincing than the largest output heuristic. Consequently, it seems reasonable to maintain our initial hypothesis despite this apparent contradiction.
It therefore seems likely that the user who supplied the input UTXO still holds output 1, which appears to be the transaction's change.
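To make the conflict between heuristics concrete, here is a minimal sketch that applies the round amount and largest output heuristics to this transaction and tallies their votes. The 10,000 sats granularity and the unweighted vote are assumptions of mine. In practice, as explained above, some heuristics deserve more weight than others.

```python
from collections import Counter


def change_by_round_amount(values, granularity_sats=10_000):
    """With two outputs, return the non-round one as the likely change.

    granularity_sats is a hypothetical definition of "round".
    Returns None when the heuristic is inconclusive.
    """
    if len(values) != 2:
        return None
    is_round = [v > 0 and v % granularity_sats == 0 for v in values]
    if is_round.count(True) != 1:
        return None
    return is_round.index(False)


def change_by_largest_output(values):
    """Index of the largest output, assumed to be the change."""
    return max(range(len(values)), key=lambda i: values[i])


values = [70_000, 22_962]  # outputs 0 and 1 of exercise 4, in sats

votes = Counter()
votes[1] += 1  # script type reuse, read from the addresses above
votes[change_by_round_amount(values)] += 1
votes[change_by_largest_output(values)] += 1

print(votes[1], votes[0])          # 2 1
print(votes.most_common(1)[0][0])  # 1
```

Two heuristics out of three designate output 1 as the change, which is consistent with the conclusion reached above.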
Exercise 5:
We can see that 8 transactions can be associated with Loïc's identity. Of these, 4 receive bitcoins:
2d9575553c99578268ffba49a1b2adc3b85a29926728bd0280703a04d051eace
8b70bd322e6118b8a002dbdb731d16b59c4a729c2379af376ae230cf8cdde0dd
d5864ea93e7a8db9d3fb113651d2131567e284e868021e114a67c3f5fb616ac4
bc4dcf2200c88ac1f976b8c9018ce70f9007e949435841fc5681fd33308dd762
The other 4 send bitcoins:
8b52fe3c2cf8bef60828399d1c776c0e9e99e7aaeeff721fff70f4b68145d540
c12499e9a865b9e920012e39b4b9867ea821e44c047d022ebb5c9113f2910ed6
a6dbebebca119af3d05c0196b76f80fdbf78f20368ebef1b7fd3476d0814517d
3aeb7ce02c35eaecccc0a97a771d92c3e65e86bedff42a8185edd12ce89d89cc
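If you export the address history from your explorer instead of reading it on screen, the split between receiving and sending transactions can be sketched as follows. I'm assuming here that each transaction is a dictionary with `txid`, `vin[].prevout.scriptpubkey_address` and `vout[].scriptpubkey_address` fields, which is the layout I expect from Esplora-style explorer APIs. Check this against the data your explorer actually returns. The transactions in the example are made-up placeholders, not real ones.

```python
def split_by_address(transactions, address):
    """Split transactions into (received, sent) txid lists for one address.

    A transaction that spends from the address counts as "sent",
    even if part of the funds comes back to the same address.
    """
    received, sent = [], []
    for tx in transactions:
        spends = any(
            (vin.get("prevout") or {}).get("scriptpubkey_address") == address
            for vin in tx.get("vin", [])
        )
        pays = any(
            vout.get("scriptpubkey_address") == address
            for vout in tx.get("vout", [])
        )
        if spends:
            sent.append(tx["txid"])
        elif pays:
            received.append(tx["txid"])
    return received, sent


ADDRESS = "bc1qja0hycrv7g9ww00jcqanhfpqmzx7luqalum3vu"

# Placeholder data for illustration only.
toy_transactions = [
    {"txid": "txid_receive",
     "vin": [{"prevout": {"scriptpubkey_address": "other_address"}}],
     "vout": [{"scriptpubkey_address": ADDRESS}]},
    {"txid": "txid_send",
     "vin": [{"prevout": {"scriptpubkey_address": ADDRESS}}],
     "vout": [{"scriptpubkey_address": "other_address"}]},
    {"txid": "txid_unrelated",
     "vin": [{"prevout": None}],
     "vout": [{"scriptpubkey_address": "other_address"}]},
]

received, sent = split_by_address(toy_transactions, ADDRESS)
print(received)  # ['txid_receive']
print(sent)      # ['txid_send']
```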
Exercise 6:
Upon examining the model of this transaction, it becomes clear that it is a grouped spending. Indeed, the transaction has a single input and 51 outputs, indicating a high level of economic activity. We can therefore hypothesize that Loïc has withdrawn bitcoins from an exchange platform.
Several factors reinforce this hypothesis. Firstly, the type of script used to secure the input UTXO is a P2SH 2-of-3 multisig script, which indicates an advanced level of security typical of exchange platforms:
OP_PUSHNUM_2 OP_PUSHBYTES_33 03eae02975918af86577e1d8a257773118fd6ceaf43f1a543a4a04a410e9af4a59 OP_PUSHBYTES_33 03ba37b6c04aaf7099edc389e22eeb5eae643ce0ab89ac5afa4fb934f575f24b4e OP_PUSHBYTES_33 03d95ef2dc0749859929f3ed4aa5668c7a95baa47133d3abec25896411321d2d2d OP_PUSHNUM_3 OP_CHECKMULTISIG
What's more, the address studied, 3PUv9tQMSDCEPSMsYSopA5wDW86pwRFbNF, is reused in over 220,000 different transactions, which is often characteristic of exchange platforms, as they generally pay little attention to their own privacy.
The temporal heuristic applied to this address also shows transactions being broadcast almost daily over a 3-month period, at all hours of the day and night, suggesting the continuous activity of an exchange platform.
Finally, the volumes handled by this entity are colossal. The address received and sent 44 BTC in 222,262 transactions between December 2022 and March 2023. These volumes are further evidence that this activity is likely that of an exchange platform.
Exercise 7:
By analyzing the transaction confirmation times, the following UTC times can be identified:
05:43, 20:51, 18:12, 17:16, 04:28, 23:38, 07:45, 21:55
Converting these times shows that UTC-7 and UTC-8 place the majority of them within typical hours of human activity (between 08:00 and 23:00).
With UTC-7:
05:43 UTC > 22:43 UTC-7
20:51 UTC > 13:51 UTC-7
18:12 UTC > 11:12 UTC-7
17:16 UTC > 10:16 UTC-7
04:28 UTC > 21:28 UTC-7
23:38 UTC > 16:38 UTC-7
07:45 UTC > 00:45 UTC-7
21:55 UTC > 14:55 UTC-7
With UTC-8:
05:43 UTC > 21:43 UTC-8
20:51 UTC > 12:51 UTC-8
18:12 UTC > 10:12 UTC-8
17:16 UTC > 09:16 UTC-8
04:28 UTC > 20:28 UTC-8
23:38 UTC > 15:38 UTC-8
07:45 UTC > 23:45 UTC-8
21:55 UTC > 13:55 UTC-8
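The conversions above can be checked with a short Python sketch. It shifts each UTC time by a candidate offset and counts how many land in the 08:00 to 23:00 window. The function names are mine, and the window is treated as including 08:00 but excluding 23:00, which is an assumption about how to read the range.

```python
UTC_TIMES = ["05:43", "20:51", "18:12", "17:16",
             "04:28", "23:38", "07:45", "21:55"]


def to_minutes(hhmm: str) -> int:
    """Convert "HH:MM" to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def local_time(hhmm_utc: str, offset_hours: int) -> str:
    """Shift a UTC time of day by a whole-hour offset, wrapping at midnight."""
    total = (to_minutes(hhmm_utc) + offset_hours * 60) % (24 * 60)
    return f"{total // 60:02d}:{total % 60:02d}"


def active_count(times_utc, offset_hours, start="08:00", end="23:00"):
    """Count the times that fall in [start, end) once shifted by the offset."""
    low, high = to_minutes(start), to_minutes(end)
    return sum(
        low <= to_minutes(local_time(t, offset_hours)) < high
        for t in times_utc
    )


for offset in (-7, -8):
    print(offset, active_count(UTC_TIMES, offset), "of", len(UTC_TIMES))
# -7 7 of 8
# -8 7 of 8
```

Keep in mind that these are confirmation times, so they only approximate the moment the transaction was actually broadcast, and 8 samples is a very small set to draw conclusions from.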
The UTC-7 offset is particularly relevant in summer, when it corresponds to Pacific Daylight Time, which covers states and regions such as:
  • California (with cities such as Los Angeles, San Francisco, and San Diego);
  • Nevada (with Las Vegas);
  • Oregon (with Portland);
  • Washington (with Seattle);
  • The Canadian province of British Columbia (with cities like Vancouver and Victoria).
This information suggests that Loïc is likely to reside on the west coast of the United States or Canada.
Exercise 8:
Analysis of this transaction reveals 5 inputs and a single output, which suggests a consolidation. Applying the CIOH (Common Input Ownership Heuristic), we can assume that all the input UTXOs are owned by a single entity and that the output UTXO also belongs to this entity. It appears that the user chose to combine several UTXOs he owned into a single output UTXO. This move was likely motivated by a desire to take advantage of the low transaction fees at the time, thereby reducing future fees.
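To illustrate how the CIOH is applied across several transactions, here is a minimal sketch that groups addresses into clusters whenever they appear together as inputs. It uses a small union-find structure. The function name and the placeholder addresses are mine, and the sketch deliberately ignores collaborative transactions, for which the common ownership assumption does not hold.

```python
def cioh_clusters(transactions):
    """Group input addresses into clusters using the CIOH.

    transactions: a list of lists, each holding the input addresses
    of one transaction. Returns a list of sets of addresses.
    """
    parent = {}

    def find(addr):
        # Follow parents up to the cluster root, compressing the path.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        for addr in inputs:
            find(addr)  # register every address, even a lone one
        for addr in inputs[1:]:
            union(inputs[0], addr)  # all inputs share one owner

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())


# Placeholder addresses for illustration only.
example = [["a", "b", "c"], ["c", "d"], ["e"]]
print(sorted(sorted(c) for c in cioh_clusters(example)))
# [['a', 'b', 'c', 'd'], ['e']]
```

Because address c appears in the inputs of two transactions, a, b, c and d end up in the same cluster. This is the reason consolidating UTXOs from different sources links them together in the eyes of an observer.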

To write this part 3 on chain analysis, I drew on the following resources:
I'd like to thank their authors, developers and producers. Thanks also to the proofreaders who meticulously corrected the article on which this part 3 is based, and gave me their expert advice: