Post-clustermempool package RBF: per-chunk processing

sdaftuar · May 23, 2024, 2:35pm

There’s a potential pinning issue introduced by using feerate diagram comparisons alone as our RBF tool, which I wanted to describe so we don’t forget this when we get to doing general package RBF.

Imagine a transaction V with high feerate that’s already in the mempool. An adversary is looking to replace V with some low-feerate, high fee transaction that will be expensive to replace (due to the total fee test).

Using the package RBF scheme described here, one possible way for the attacker to do this would be to prepare some big cluster that has more transactions in it than we can optimally linearize. Within it, imagine that we have some part of the transaction graph that (say) looks like this:

   graph TD
   A[some big cluster of transactions]--> P["P (low fee)"]
   P-->B["B (high fee)"]
   P-->C["C (high fee)"]

This might go undiscovered as a good single chunk in our linearization, because our heuristics might have a hard time picking this up (eg it’s overlooked in ancestor feerate sorting). However, a single transaction that spends B and C, even at a very low feerate, will cause the set to be considered and possibly improve the overall linearization quite a lot. So imagine an attacker constructs two transactions, D and E, where D spends B and C, while E conflicts with V:

   graph TD
   V["V (high feerate)"]
   A[some big cluster of transactions]--> P["P (low fee)"]
   P-->B["B (high fee)"]
   P-->C["C (high fee)"]
   B-->D["D (low fee)"]
   C-->D
   D-->E["E (slightly higher feerate than D, but very big and conflicts with V)"]

We could imagine that merely looking at D being added to the cluster causes the cluster’s feerate diagram to improve (because we discover the chunk [P, B, C]), so that the RBF that considers adding D+E vs V succeeds – even though the good transactions which lead to the improvement were already in the mempool that contains V as well.

And then once E is in the mempool, it’s expensive for V to replace it… Pinning complete.

Really, the attacker is able to exploit two things in this scenario:

Being able to run the linearization algorithm one extra time compared with the existing mempool. If we do a bounded search, say, getting more search time can randomly cause a better linearization to be found, which benefits the new transaction when compared with the old.
Being able to exploit limitations of our linearization algorithm. An attacker can intentionally construct a cluster that will linearize poorly using our various heuristics, but when a new leaf transaction is added suddenly sorts better.

Also, this problem exists with how I envisioned single-transaction RBF to work as well (imagine D and E were a single transaction, rather than two, which conflicts with V). We’re exploring whether it might be possible to fix the single-RBF case by requiring the chunk feerate of the new transaction to be better than its conflicts, but I am not sure any such heuristic would work in the package RBF case without being too draconian.