Successor to [DEFUNCT] Post-clustermempool Package RBF. I felt starting over in a new topic is a bit cleaner.
Starting from a minimal design
As a thought experiment, start with a minimal design: the relayed package must be a single chunk (when not considering in-mempool ancestors) after linearization, and it is only ever considered in its entirety, or not at all. This avoids the complexity of trying to optimize finding the best subset to take subject to DoS constraints, is sufficient for the one-parent-one-child case, and computationally speaking, only involves a single feerate diagram comparison (the most potentially expensive step). I feel this would be a perfectly reasonable candidate design to consider for a first implementation.
A straightforward generalization of this is possible: if the relayed package (excluding in-mempool ancestors) after linearization consists of multiple chunks, process them completely separately (exactly as if they were submitted one by one). This makes it not gratuitously reject packages due to the receiver unexpectedly splitting the linearization up as a result of having some subpackage already, but beyond that doesnât provide any âqualityâ that package-must-be-one-chunk wouldnât have. Itâs also not computationally worse - if multiple diagram checks are too expensive, processing can be interrupted (like GETDATA etc.) between chunks to limit P2P message processing latency. And since itâs exactly equivalent to an attacker just giving multiple packages, there is no additional DoS concern either.
Maybe that generalization is actually what we want already.
Resulting proposal
To be explicit, here is what that would amount to for a received package (set of transactions) PKG
:
-
- Deduplication: remove from
PKG
any transaction thatâs already in the receiverâs mempool.
- Deduplication: remove from
-
- Pre-linearization: linearize (what remains of)
PKG
(so without in-mempool dependencies).
- Pre-linearization: linearize (what remains of)
-
- Splitting: For each chunk
CNK
in that linearization (in order of decreasing feerate):
-
- Limiting: If
feerate(CNK) < max(mempoolminfee, incremental_relay_feerate)
, stop processing (of this and all further chunks). This is necessary to avoid tiny-increment replacements at the bottom of the mempool, and can cheaply be done early on. This only works because any low-feerate suffix of the linearization will, if accepted to the mempool, become a low-feerate set there too (and possibly even lower, if itâs merged with other things). Thus, this lets us efficiently predict and prevent additions to the mempool that would be below its minimum feerate. This aspect does not work pre-clustermempool.
- Limiting: If
-
- Conflict-finding: find the set
CON
, the in-mempool conflicts withCNK
.
- Conflict-finding: find the set
-
- Replacement checks: Only necessary if
CON
is non-empty:
-
- Relay check: If
fee(CNK) < fee(CON) + incremental_relay_feerate * size(CNK)
, then the replacement does not pay for relay of the chunk. If so, stop processing this chunk and continue with further ones (which may fail due to having dependencies in this failed chunk).
- Relay check: If
-
- Tail check: if
fee(CNK) < fee(CON) + tail_feerate * (size(CNK) - size(CON))
, the replacement worsens the bottom of the mempool, so continue with the next chunk. This check is only necessary iftail_feerate > incremental_relay_feerate
.
- Tail check: if
-
- Gather OLD: the union of all in-mempool clusters that contain elements of
CON
, or contain ancestors ofCNK
.
- Gather OLD: the union of all in-mempool clusters that contain elements of
-
- Compute NEW:
NEW = OLD - CON + CNK
.
- Compute NEW:
-
- Linearization: linearize all clusters in
NEW
(note that there can be multiple; a replacement can split up clusters).
- Linearization: linearize all clusters in
-
- Diagram check: verify that the fee-size diagram of
NEW
is nowhere worse than that ofOLD
, and at least in some place better. If not, the replacement is not necessarily an improvement to the mempool, and continue with the next chunk.
- Diagram check: verify that the fee-size diagram of
- Replacement checks: Only necessary if
-
- Verification: verify all transactions in
CNK
under standard/policy rules (including script verification). If this fails, continue with the next chunk. Otherwise, the chunk is a valid and desirable addition/replacement to the mempool.
- Verification: verify all transactions in
-
- Eviction Drop all of
CON
from the mempool (if any).
- Eviction Drop all of
-
- Addition Add all of
CNK
to the mempool, performing consensus validation to prime execution/script caches, and have a last-resort check that consensus validity is implied by policy validity.
- Addition Add all of
-
- Interruption If needed, interrupt processing to allow other peersâ messages
- Splitting: For each chunk
Good enough?
The result of this whole thing is that it does result in a limited âfind better subset of new package to acceptâ over only considering all/nothing of package, but this seems more of a side-effect of trying to make sure things donât gratuitously break when the receiverâs mempool differs, than being a goal on itself.
Another observation is that this does not permit earlier chunksâ replacements to pay for the relay of later chunks (every chunk needs to pay for itself), but it does permit replacements within one chunk to pay for the later transactions in the same chunk. Consider this example (sats/bytes
notation, mempoolminfee=1 sat/byte
):
graph BT
B["Tx B: 500/100"] --> A["Tx A: 3000/1000"];
B x--x |conflict| Bp["Tx B': 1000/500"];
If only B'
is in the mempool already, then the package A,B
will be accepted by the above rules, because âAâ helps pay the fee for the relay of B'
. If however A,B'
were in the mempool already, then B
would not be accepted as the net fee would be negative (and thus certainly not enough to pay for the relay of B
).
Because B
is âappropriatingâ the fee from A
to pay for the relay of B'
, the replacement breaks if the receiver already has A
. In principle, that means an attacker that sees A,B
could choose to only pass on A
to a victim, resulting in B
becoming (semi-permanently) unacceptable to the victim. This may be undesirable to the author of B
, but nothing can be done about this without fundamentally different anti-DoS rules. Itâs only possible to be more restrictive and not permit the replacement at all (e.g. by trying to split chunks up into even smaller pieces if theyâre below mempoolminfee
). Instead, I believe the observation is just that appropriation of parent fees cannot be guaranteed to work, as it depends on how nodes see the transactions packaged up, so users should be advised to avoid it (and itâs rather strange anyway, perhaps there arenât really use cases for it regardless).
And I think thatâs also a justification why it is acceptable that this wouldnât work across chunks: since appropriation cannot cannot be relied on for the same reason (receiver may have higher chunk already), we donât need to spend effort on making appropriation work well (but if itâs easier to permit it, like within-chunk, thatâs also not a problem).
Iâm not sure it is also acceptable that it doesnât work across chunks in the other direction. It seems reasonable that weâd want to permit child fees to pay for evictions necessary for their parents. If so, if the Relay check (7) fails, instead of stopping of processing, perhaps the loop can continue, but with the combination of all unprocessed chunks so far as EDIT: this is likely not necessary, and leads to complex interactions between feerate and fee checks.CNK
? That actually starts to sound a lot like the approach from PR26711, except gathering until DoS checks are satisfied, rather than gathering until feerate checks are satisfied.
In a way, this design is pretty much taking what weâre allowing today (submitting a transaction one by one), but now also permit grouping transactions together as little as possible while retaining the newly desired functionality: that higher-feerate children can be grouped with their parents at processing time.