Great Consensus Cleanup Revival

As mentioned [1], I think the worst case proof size now is actually ~1 MB. For example, imagine we have the following block (P2P block serialization):

| Bytes | Description |
| --- | --- |
| 80 | Header |
| 1 | Tx count |
| 999,826 | Coinbase tx |
| 93 | 1-in, 1-out P2TR (stripped size) |

A lite client performing whitepaper-style SPV can be tricked into accepting a fake transaction if the coinbase transaction was actually 64 bytes. To avoid that, it needs to learn the contents of the entire coinbase transaction in order to derive its txid for verifying its depth in the merkle tree. That means, even with optimizations, a proof size of about 1 MB.

Obviously, very large coinbase transactions will be rare given that they reduce miners’ ability to include fee-paying transactions, but I think it’s worth noting in discussion and documentation that the worst case is ~1 MB. It should still be possible to validate a worst-case merkle proof with the coinbase in witness data (given other soft fork changes), but it would be ~2,000x more expensive than validating a merkle proof that didn’t require a copy of the entire coinbase transaction.

@evoskuil mentioned an alternative potential soft fork: a commitment to tree depth, which could be done for any depth possible under the current consensus rules[1] using only 4 bits. He didn’t suggest where the commitment could be stored, but I think it’s clear that we’re probably never going to use all of the BIP8/9 versionbits, and miners currently seem satisfied with the 16 BIP320 version bits, meaning we could probably put the commitment in the block header version field. That wouldn’t require any extra bandwidth for SPV.
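As a rough illustration of the idea (a sketch only, not a concrete proposal: the choice of bits is hypothetical, and `encode_depth`/`decode_depth` are made-up names), a depth in [1, 16] fits in 4 bits of the 32-bit header version without touching the 16 BIP320 rolling bits (13–28):

```python
# Hypothetical layout: commit the merkle tree depth (1..16) in the low 4
# bits of the header version, leaving BIP320's bits 13-28 free for miners.
DEPTH_MASK = 0xF

def encode_depth(version: int, depth: int) -> int:
    """Store depth-1 in the low 4 bits of the version field."""
    assert 1 <= depth <= 16
    return (version & ~DEPTH_MASK) | (depth - 1)

def decode_depth(version: int) -> int:
    """Recover the committed depth from a header version."""
    return (version & DEPTH_MASK) + 1
```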

I think the two cons of that approach are:

  • We might want to give more (eventually all of the available) versionbits to miners in the future as hashrate increases; that way individual hardware doesn’t need to create coinbase extranonces or use hacks like nTime rolling.
  • It still makes SPV more complicated than described in the whitepaper, although only slightly more so.

[1] 4 bits can express a maximum depth of 16 and a tree of depth 16 can have up to 65,536 transactions. However, the minimum possible transaction size is 60 bytes and a 999,919-byte block (excluding header and tx count) can only fit a maximum of 16,665 transactions of that size.


I don’t think that’s strictly true? SHA256 includes the size of its input when calculating the hash, so implicit in being provided a midstate is being provided the number of bytes that went into it. In Bitcoin Core’s implementation that’s `s` (the midstate), `buf` (the remaining bytes that will build the next chunk), and `bytes` (the number of bytes that went into both of those things). So a preimage for the txid of a very large coinbase tx can still be reduced to as few as 104 bytes. That requires you have some way of differentiating a “valid” 64-byte coinbase tx from an internal node of the tx merkle tree, but it’s easy to transmit the full coinbase tx in that case anyway.
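The 104-byte figure falls out of the serialized hasher state described above (the field names mirror Bitcoin Core’s `CSHA256`; the exact serialization here is illustrative):

```python
midstate_bytes = 32   # `s`: the eight 32-bit SHA256 state words
buffer_bytes = 64     # `buf`: at most one unprocessed 64-byte chunk
length_bytes = 8      # `bytes`: 64-bit count of bytes hashed so far
proof_bytes = midstate_bytes + buffer_bytes + length_bytes
print(proof_bytes)    # 104
```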

Compared to the status quo, it improves the correctness of block inclusion proofs. Compared to including the coinbase in the proof, I think the reduction in complexity is more of an advantage tbh:

  • with the 64-byte-transaction soft fork: you need a merkle path, and also to check the witness-stripped tx isn’t 64 bytes
  • with the coinbase in the proof: you need a combined merkle path to both the coinbase and the tx; you need to check they’re the same depth; you need to check the coinbase is valid; and in order to minimise bandwidth you need to deal with sha256 midstates
  • with a depth commitment: you need to look up the block’s merkle tree depth via (TBD) (except if the block height is prior to the activation height, in which case you’ll need to use this fixed table and hope there’s no absurdly large reorg), then check the merkle path is exactly that long

Perhaps worth noting that these merkle proofs will only give you the witness-stripped tx anyway: in order to verify the tx’s witness data you need the full coinbase in order to obtain the segwit commitment (which can be essentially anywhere in the coinbase tx), and then you need the matching merkle path from that commitment to your non-stripped tx. So for a lite node, checking the stripped size of the tx is straightforward; that’s all the data it’s likely to be able to verify anyway.

It occurs to me that most of the time, you’re probably more interested in whether an output is (was) in the utxo set, rather than whether a tx is in a block. But I think the concerns here are irrelevant for things like utreexo that have some new merkle tree: they can use a prefix to distinguish tree contents from internal nodes.
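The prefix technique mentioned for new trees is the standard domain-separation trick (used, e.g., by RFC 6962 Certificate Transparency); a one-byte tag makes it impossible to confuse a 64-byte leaf with a pair of child digests. A minimal sketch, with hypothetical function names:

```python
import hashlib

def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

# Distinct prefixes for leaves vs internal nodes: even a 64-byte leaf
# hashes differently from the concatenation of two 32-byte child digests.
def leaf_hash(data: bytes) -> bytes:
    return sha256(b"\x00" + data)

def node_hash(left: bytes, right: bytes) -> bytes:
    return sha256(b"\x01" + left + right)
```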


It took me a while to figure out how knowing the number of bytes used to create a digest was an alternative to validating the structure of a coinbase transaction. Do I accurately understand that you propose the following:

  1. Prover provides the coinbase transaction midstate (including size) for all but the final chunk of the first of the double SHA256 hashes, plus the final chunk.

  2. Prover provides a partial merkle branch for the coinbase transaction.

  3. Verifier uses the midstate (including size) and the final chunk to generate the coinbase txid.

  4. Verifier checks all provided nodes in the partial merkle branch are on the right side (i.e., that the branch is for the first transaction in a block, i.e. the coinbase transaction).

  5. Verifier inserts its generated coinbase txid into the partial merkle branch and verifies up the tree until it connects to the merkle root in the block header.

After the above process, the verifier can be certain of the following:

  • The number of bytes used to generate the digest (txid).

    • If the number of bytes is not equal to 64, then the digested data must have been a transaction (or the block was invalid).

    • If the number of bytes is equal to 64, then the digested data may have been either a transaction or concatenated digests for an internal merkle node. The verifier can distinguish between these by checking whether the 64-byte chunk provided to it matches a coinbase transaction pattern.

  • The verified transaction is a coinbase transaction given the inability to use CVE-2017-12842 to forge coinbase transactions and the position of the transaction in the merkle tree (or the block was invalid).

  • The depth of the coinbase transaction, which is the same depth as all other transactions in the tree.
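Steps 4 and 5 can be sketched as follows (a minimal illustration only: it assumes the coinbase txid has already been derived from the midstate per step 3, `verify_coinbase_branch` is a hypothetical name, and hashes are handled as raw bytes rather than display-order hex). Because the coinbase is the first leaf, the running hash is always the left operand and every provided node must be a right sibling:

```python
import hashlib

def sha256d(b: bytes) -> bytes:
    """Bitcoin's double SHA256."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def verify_coinbase_branch(coinbase_txid: bytes, branch: list[bytes],
                           merkle_root: bytes, expected_depth: int) -> bool:
    # The branch length is the tree depth; a committed depth (or one
    # implied by the proof scheme) pins it down.
    if len(branch) != expected_depth:
        return False
    h = coinbase_txid
    for sibling in branch:
        # Leftmost position: our hash on the left, sibling on the right.
        h = sha256d(h + sibling)
    return h == merkle_root
```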

I agree, as the average absolute bandwidth is trivial, can be easily reduced by another 36 bytes, and can be amortized over an entire block of transactions.

And as the above implies, worst case should not be a consideration given that it is reasonably bounded.

Given that bandwidth is not a material issue, adding sha256 midstates is unnecessary complexity.

Also, by simply compacting out the excess 36 bytes (the coinbase input’s null outpoint: a 32-byte zero hash plus a 4-byte index) as mentioned above, there is no need to validate the coinbase. Reconstituting it with the presumed null point is sufficient (e.g. defaulting a tx deserialization with no input point to a null input point).
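The reconstitution step is mechanical; a sketch of the idea (the function name and the assumption that the outpoint sits at a known offset, after the 4-byte version and 1-byte input count in a non-witness serialization, are mine):

```python
# The coinbase's single input always spends the null point: 32 zero bytes
# followed by the index 0xffffffff. A compacted proof can omit these 36
# bytes and the verifier reinserts them before hashing.
NULL_POINT = b"\x00" * 32 + b"\xff\xff\xff\xff"

def reconstitute_coinbase(compacted: bytes, outpoint_offset: int) -> bytes:
    """Reinsert the 36-byte null outpoint at its known offset."""
    return compacted[:outpoint_offset] + NULL_POINT + compacted[outpoint_offset:]
```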

Consequently the complexity comparison becomes:

  • Ensure that the stripped transaction is not 64 bytes, and ensure that its Merkle path is valid.

vs:

  • Ensure that the (reconstituted) coinbase Merkle path is valid, and ensure the same valid Merkle depth for any tx in the block.

Validating the coinbase Merkle path is of course code reuse, so the only new logic is checking the Merkle path length. That is the tradeoff on the SPV-wallet side. The tradeoff in the full node is mandatory checking of the stripped size of all transactions (and a soft fork) vs. doing nothing (possibly “compressing” null points for SPV clients).

Cross-referencing this SV2-related post here, as I brought up GCC as a point of relevance in that discussion.