Original Article Title: "Bitcoin's Duplicate Transactions"
Original Source: BitMEX Research
In the Bitcoin blockchain, there are two sets of completely identical transactions, with one set "sandwiching" the other, all occurring in mid-November 2010. Duplicate transactions can cause confusion, and Bitcoin developers have tackled the problem in various ways over the years. The issue is still not 100% resolved, and the next potential duplicate transaction could occur in 2046. Although the risk associated with duplicate transactions is now very small, it remains an interesting and quirky bug worth pondering.
A regular Bitcoin transaction spends at least one output of a previous transaction by referencing that transaction's ID (TXID). Each of these unspent outputs can only be spent once; if an output could be spent twice, you could double-spend Bitcoin, rendering it worthless. However, in Bitcoin there exist exactly two sets of completely identical transactions. This anomaly is possible because a coinbase transaction has no real transaction inputs and instead creates newly minted coins. Two different coinbase transactions can therefore send the same amount to the same address and be constructed in exactly the same way, making them completely identical. Because these transactions are the same, their TXIDs also match, as the TXID is a hash digest of the transaction data. The only other way TXIDs could be duplicated is through a hash collision, which is considered practically infeasible for a cryptographically secure hash function. A collision in a hash function like SHA256 has never occurred in Bitcoin or anywhere else.
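The point that identical transaction data yields identical TXIDs can be illustrated with a minimal sketch. It assumes the TXID is the double SHA256 of the serialized transaction, displayed in byte-reversed hex; the byte strings below are placeholders invented for illustration, not real serialized transactions.

```python
import hashlib

def txid(raw_tx: bytes) -> str:
    """Double SHA256 of the serialized transaction, in byte-reversed hex."""
    digest = hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()
    return digest[::-1].hex()

# Placeholder bytes standing in for serialized coinbase transactions (hypothetical).
coinbase_a = b"hypothetical coinbase: 50 BTC to address X"
coinbase_b = b"hypothetical coinbase: 50 BTC to address X"  # byte-for-byte identical
coinbase_c = b"hypothetical coinbase: 50 BTC to address Y"  # differs in one byte

same_txid = txid(coinbase_a) == txid(coinbase_b)
different_txid = txid(coinbase_a) != txid(coinbase_c)
print(same_txid, different_txid)
```

Nothing in the hash input ties a coinbase transaction to the block it was mined in, which is why two coinbases built the same way end up with the same TXID.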
These two sets of duplicate transactions occurred in close proximity, from 08:37 UTC on November 14, 2010, to 00:38 UTC on November 15, 2010, spanning approximately 16 hours. The first set of duplicate transactions was sandwiched between the second set. We classify d5d2…8599 as the first duplicate transaction because it became the duplicate first, although oddly, it first appeared on the blockchain after another duplicate transaction e3bf…b468.
In the images below, you can see two screenshots from the mempool.space block explorer, displaying the occurrence of the first duplicate transaction in two different blocks.
Interestingly, when entering the relevant URL in a web browser, the mempool.space block explorer defaults to showing an earlier block for the case of d5d2…8599 and a later block for the case of e3bf…b468. Blockstream.info and Btcscan.org exhibit the same behavior as mempool.space. On the other hand, based on our basic testing, Blockchain.com and Blockchair.com behave differently, always displaying the latest version of a conflicting transaction when entering the URL in the browser.
Of the four relevant blocks, only one (Block 91,812) contains any transaction other than the coinbase. That transaction merged outputs of 1 BTC and 19 BTC into a single 20 BTC output.
Because there are two pairs of identical TXIDs, any later transaction that references one of them is ambiguous. Each duplicate transaction is worth 50 BTC. These transactions therefore involve a total of 4 x 50 BTC = 200 BTC or, depending on the interpretation, 2 x 50 BTC = 100 BTC. In a sense, 100 BTC effectively do not exist.
As of today, all 200 BTC remain unspent. As we understand it (and we could be wrong here), anyone holding the private keys for these outputs could spend the bitcoins. However, once an output is spent, the UTXO is removed from the database, so the duplicate 50 BTC becomes unspendable and lost, leaving only 100 BTC potentially recoverable. Whether such a spend would be treated as coming from the earlier block or the later one may be undefined or indeterminate.
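The 200 BTC versus 100 BTC arithmetic can be sketched with a toy model. This is only an illustration under an assumption: that the unspent outputs database is keyed by TXID and output index, so a duplicate coinbase overwrites the earlier entry rather than adding a second one. The function name and the dictionary layout are hypothetical, and the TXIDs are kept in the truncated form used in this article.

```python
# Hypothetical, simplified UTXO set: (txid, output_index) -> value in BTC.
utxo_set = {}

def add_coinbase(utxos, txid, value_btc):
    """Record a coinbase output; an identical key overwrites the old entry."""
    utxos[(txid, 0)] = value_btc

# Order of appearance on the chain, as described above (one set sandwiching the other).
add_coinbase(utxo_set, "e3bf…b468", 50)
add_coinbase(utxo_set, "d5d2…8599", 50)
add_coinbase(utxo_set, "d5d2…8599", 50)  # first duplicate
add_coinbase(utxo_set, "e3bf…b468", 50)  # second duplicate

minted_btc = 4 * 50                      # 200 BTC were created in total
spendable_btc = sum(utxo_set.values())   # only two entries survive
print(minted_btc, spendable_btc)
```

Under this assumption only two database entries remain, which matches the reading that just 100 BTC are potentially recoverable.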
The owner could have spent all the bitcoins before creating the duplicate transactions and only then created the duplicate outputs, which would have made new entries in the unspent outputs database. There would then have been not only duplicate transactions, but duplicate transactions whose outputs had already been spent. Had this happened, spending these outputs could have generated further duplicate transactions, creating a chain of duplicates. The order of events matters: one must always spend before creating the duplicate, or the bitcoins may be lost forever. These new duplicates would not be coinbase transactions but "regular" transactions. Fortunately, this scenario never occurred.
Duplicate transactions are obviously a bad thing. They can cause confusion in wallets and block explorers and obscure the provenance of bitcoins. They also open up many attacks and vulnerabilities. For example, you could pay someone twice, once with each of the two duplicate transactions. Then, when the recipient tries to spend the money, they might find that only half of it is spendable. This could be used to attack an exchange in an attempt to make it insolvent, while the attacker incurs no loss because they can withdraw funds immediately after depositing.
To address the issue of duplicate transactions, in February 2012 Bitcoin developer Pieter Wuille proposed the BIP30 soft fork, which prohibits a transaction from reusing an existing TXID unless the outputs of the earlier transaction have already been spent. This soft fork applied to all blocks after March 15, 2012.
In September 2012, Bitcoin developer Greg Maxwell modified this rule so that the BIP30 check applies to all blocks, not just those after March 15, 2012, with the exception of the two duplicate transactions mentioned earlier. This fixed some Denial-of-Service (DoS) vulnerabilities. Technically, this was another soft fork, although the rule change only affected blocks more than 6 months old, so it did not carry the risks normally associated with protocol rule changes.
The computational cost of this BIP30 check is high. Nodes need to go through every transaction output in a new block and verify whether that output already exists in the UTXO set. This might be why Wuille only checked against unspent outputs: checking against all outputs, spent or unspent, would be even more expensive and would make pruning impossible.
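The check described above can be sketched roughly as follows. This is not Bitcoin Core's implementation, only a simplified illustration; the function name, the `(txid, output_count)` block representation, and the set-based UTXO model are all assumptions made for the example.

```python
def violates_bip30(block_txs, utxo_set):
    """Return True if any output the block would create already exists unspent.

    block_txs: list of (txid, number_of_outputs) pairs (hypothetical layout).
    utxo_set:  set of (txid, output_index) pairs that are currently unspent.
    """
    for txid, n_outputs in block_txs:
        for index in range(n_outputs):
            if (txid, index) in utxo_set:
                return True
    return False

utxos = {("aa", 0)}                                # "aa" has one unspent output
reuses_unspent = violates_bip30([("aa", 1)], utxos)
fresh_txid = violates_bip30([("bb", 2)], utxos)

utxos.discard(("aa", 0))                           # the old output gets spent
reuses_spent = violates_bip30([("aa", 1)], utxos)
print(reuses_unspent, fresh_txid, reuses_spent)
```

The sketch makes the cost visible: every output of every transaction in a new block needs a lookup in the UTXO set. It also shows why reusing a TXID whose outputs are fully spent passes the check.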
In July 2012, Bitcoin developer Gavin Andresen proposed the BIP34 soft fork, which activated in March 2013. This protocol change required coinbase transactions to include the block height, which also enabled block versioning. The block height is added as the first item in the coinbase transaction scriptSig. The first byte of the coinbase scriptSig is the number of bytes the block height number uses, followed by the block height number itself. For roughly the first 160 years (2^23 / (144 blocks per day * 365 days per year)), the first byte should be 0x03. That is why modern coinbase scriptSigs (in hex) always start with 03. This soft fork appeared to definitively solve the duplicate transaction problem, as all transactions should now be unique.
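The encoding described above can be sketched in a few lines. This is an illustrative helper, not code from any Bitcoin implementation; it assumes the height is pushed as a minimal little-endian signed number preceded by its length, and it deliberately skips very small heights, which are a special case.

```python
def encode_height(height: int) -> bytes:
    """Length byte followed by the height as a little-endian signed number."""
    if height < 17:
        raise ValueError("small heights are a special case not handled in this sketch")
    # One extra bit is reserved for the sign, hence +8 rather than +7.
    body = height.to_bytes((height.bit_length() + 8) // 8, "little")
    return bytes([len(body)]) + body

example_hex = encode_height(500_000).hex()         # a 3-byte height
last_three_byte = encode_height(2**23 - 1)[0]      # still fits in 3 bytes
first_four_byte = encode_height(2**23)[0]          # needs a 4th byte

# Years until heights outgrow 3 bytes, at 144 blocks per day.
years_with_three_bytes = 2**23 / (144 * 365)
print(example_hex, last_three_byte, first_four_byte, round(years_with_three_bytes, 1))
```

Heights up to 2^23 - 1 fit in three bytes because the top bit of the last byte is the sign bit, which is where the 2^23 in the estimate comes from.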
Due to the adoption of BIP34, in November 2015 Bitcoin developer Alex Morcos added a pull request to the Bitcoin Core software repository that made nodes stop performing the BIP30 check. After all, since BIP34 had fixed the issue, this expensive check was no longer considered necessary. Although it was not known at the time, this was technically a hard fork for some very rare blocks in the future. In hindsight, the potential hard fork matters little, because almost no one runs node software from before November 2015. At forkmonitor.info, we run Bitcoin Core 0.10.3, released in October 2015. This client therefore follows the pre-hard-fork rules and still performs the expensive BIP30 check.
It turns out that some coinbase transactions in blocks before the activation of BIP34 had scriptSigs whose first bytes happened to match a block height that would be valid in the future. So while BIP34 did fix the issue in almost all cases, it was not a complete 100% fix. In 2018, Bitcoin developer John Newbery compiled the complete list of these potentially duplicable blocks, as shown in the table below.
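A rough sketch of how such candidates might be spotted: read the first bytes of an old coinbase scriptSig as if they were a BIP34 height and see whether the result lies in the future. This is not the method used to build the list mentioned above. The function names are hypothetical, the parsing is simplified (plain pushes of 1 to 4 bytes only), and the scriptSig bytes in the examples are constructed for illustration rather than taken from real blocks.

```python
def claimed_height(script_sig: bytes):
    """Read the leading bytes as a BIP34-style height push, or return None."""
    if not script_sig:
        return None
    n = script_sig[0]
    if not 1 <= n <= 4 or len(script_sig) < 1 + n:
        return None
    body = script_sig[1:1 + n]
    if body[-1] & 0x80:          # sign bit set: not a positive height
        return None
    return int.from_bytes(body, "little")

def could_be_duplicated(script_sig: bytes, mined_height: int) -> bool:
    """Necessary condition only: the claimed height is later than the real one."""
    height = claimed_height(script_sig)
    return height is not None and height > mined_height

# Constructed example: a prefix that reads as height 1,983,702, mined at 164,384.
future_match = could_be_duplicated(bytes.fromhex("03d6441e") + b"\x00\x00", 164_384)
# Constructed example: a prefix that reads as height 100, mined at 150,000.
past_match = could_be_duplicated(bytes.fromhex("0164"), 150_000)
print(future_match, past_match)
```

A real scan would also have to confirm that the rest of the coinbase transaction can be reproduced exactly, which this sketch does not attempt.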
*Note: The coinbase transactions of these blocks were mined in 2012 and 2017 and were not duplicated. Block 209,921 (only 79 blocks before the first halving) cannot be duplicated because BIP30 was enforced during this period.
Source: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd
Number of potentially duplicate Coinbase transactions listed by year
Source: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd
Therefore, the next block in which a duplicate transaction could occur is Block 1,983,702, expected around January 2046. The coinbase transaction of Block 164,384, mined in January 2012, sent 170 BTC to seven different output addresses. A miner who wanted to carry out this attack in 2046 would therefore not only need to be lucky enough to find this block, but would also need to burn just under 170 BTC in transaction fees, for a total cost slightly above 170 BTC once the opportunity cost of the 0.09765625 BTC block subsidy is included.
At the current Bitcoin price of $88,500, this would cost over $15 million. Who owns the seven addresses from the 2012 coinbase transaction is unknown, and the keys are likely lost. All seven outputs of that coinbase transaction have now been spent, three of them in a single transaction. We suspect these funds may be related to the Pirate40 Ponzi scheme, but this is just speculation on our part. The attack therefore seems not only costly but also practically useless to the attacker. It would be a lot of money to spend just to hard fork nodes from before November 2015, which by then would be 31 years old, off the network.
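The figures above can be reproduced with a short calculation. It assumes the standard schedule of a 50 BTC initial subsidy halving every 210,000 blocks, and it treats the burned amount as a round 170 BTC, which is an approximation of the figures in the text; the constant and function names are illustrative.

```python
HALVING_INTERVAL = 210_000
INITIAL_SUBSIDY_SAT = 50 * 100_000_000   # 50 BTC in satoshis

def block_subsidy_sat(height: int) -> int:
    """Block subsidy in satoshis: halved once per 210,000 blocks."""
    return INITIAL_SUBSIDY_SAT >> (height // HALVING_INTERVAL)

subsidy_sat = block_subsidy_sat(1_983_702)     # 9 halvings have occurred
subsidy_btc = subsidy_sat / 100_000_000

burned_btc = 170                               # approximation used in the text
price_usd = 88_500                             # price quoted in the article
cost_usd = burned_btc * price_usd
print(subsidy_btc, cost_usd)
```

The subsidy works out to 0.09765625 BTC and the cost to roughly $15.0 million, in line with the figures quoted.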
The next vulnerable block after that is Block 169,985 from March 2012. Its coinbase transaction paid out just over 50 BTC, far less than 170 BTC. Of course, 50 BTC was the subsidy at the time, and when this coinbase transaction becomes vulnerable to duplication in 2078, the subsidy will be much lower. To exploit it, a miner would need to spend around 50 BTC that could never be recovered, because the funds would have to go to the old outputs from 2012. Nobody knows what the price of Bitcoin will be in 2078, but the cost of such an attack could be prohibitive. So while this issue is not a primary risk for Bitcoin, it is still somewhat concerning.
Since the SegWit upgrade in 2017, coinbase transactions can also include a commitment to the witness data of all transactions in the block. The pre-BIP34 coinbase transactions contain no such witness commitment. Hence, to create a duplicate coinbase transaction, a miner would need to exclude every transaction that redeems a SegWit output from the block, further increasing the opportunity cost of the attack, as the block may not be able to include many other fee-paying transactions.
Considering the difficulty and cost of duplicating transactions and how rare the opportunities to do so are, this transaction duplication vulnerability does not appear to be a major security concern for Bitcoin. Still, given the time scales involved and the novelty of duplicate transactions, it makes for an interesting thought experiment. Developers have spent a considerable amount of time on this issue over the years, and some of them may regard 2046 as the final deadline for addressing it. There are many potential ways to fix the issue, most of which would likely require a soft fork. One possible fix would be to make the SegWit commitment mandatory.