Hmm, poking at this some more, I think my previous post went too far into the miners-all-cooperate edge case.
If you consider some pinning tx that won’t be worth mining for 100 blocks, and a stratumv2 mining pool with only 98% hashrate that keeps the pinning tx hoping for its high fees, that still leaves an 86% chance that the remaining 2% hashrate mines a block in the meantime with the high feerate tx, invalidating the pinning tx. In that case, the pinning tx needs to pay h^{1-n} times as much as the pinned tx (with h the pool’s share of hashrate and n the number of blocks before the pinning tx is worth mining), or the pool would have had a higher expected value for just trying to mine the higher feerate/lower fee pinned tx instead. For a pool with 100% hashrate, that devolves to always requiring a higher fee independent of n; but for a pool with 90% hashrate, it’s already 100x fee at about 45 blocks; at 50% hashrate, it’s 500x fee at 10 blocks. For the 98% pool at 100 blocks, it’s a factor of about 7.4.
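To make the arithmetic above easy to check, here is a minimal Python sketch. The function names are mine, purely for illustration, and the model is only the simplified one described above: a single pool with hashrate share h that waits n blocks for the pinning tx, while everyone else mines the pinned tx as soon as they find a block.

```python
def chance_outsider_mines_first(h: float, n: int) -> float:
    """Probability that hashrate outside the pool finds at least one
    of the next n blocks, assuming each block goes to the pool
    independently with probability h."""
    if not 0.0 < h <= 1.0:
        raise ValueError("h must be in (0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - h ** n


def required_fee_multiple(h: float, n: int) -> float:
    """Fee multiple h^(1-n) that the pinning tx must pay, relative to
    the pinned tx, before waiting beats mining the pinned tx now
    (the simplified model from the post, not a full mempool model)."""
    if not 0.0 < h <= 1.0:
        raise ValueError("h must be in (0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    return h ** (1 - n)


if __name__ == "__main__":
    # The (hashrate, blocks) pairs mentioned in the post.
    for h, n in [(0.98, 100), (0.90, 45), (0.50, 10), (1.00, 100)]:
        print(f"h={h:.2f} n={n:3d} "
              f"outsider chance={chance_outsider_mines_first(h, n):.3f} "
              f"fee multiple={required_fee_multiple(h, n):.1f}")
```

With h = 1 the multiple is exactly 1 for any n, which is the “always requires a higher fee” case.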
That model may give a strong incentive to defect, if you get some direct benefit from fees, other than just “the pool has more funds to distribute”. For example, if a sv2 pool finds that some percent of hashpower is just working on empty blocks, perhaps it will decide to issue rewards rated not just by number of shares found, but also by the potential reward that those blocks would return to the pool. In that case, defecting and attempting to mine the replacement transaction would increase your share of rewards in the short term, and, provided the pool doesn’t check for that sort of defection and punish it, and without having done the maths, I think that would end up being attractive, at least, for small miners within the pool.
I think that still adds up to “the replacement tx needs to pay a fee proportional to the total fee of the tx being replaced” though – along with having a threshold along the lines of “don’t replace txs you expect will confirm in the next 50 blocks” – but at least it allows that proportion to be substantially less than 1.
I think in this view, you can conclude that as long as you don’t have a mining pool/cartel with over 95% hashrate, then replacing a tx that won’t be mined for 135 or more blocks with one that would be mined immediately is fine unless the total fee of the existing tx is more than 1000x that of the new one (pretty implausible as it would mean even a 100vb tx replacement would be paying a lower feerate than a max size 100kvb tx), as is replacing a tx whose total fee is under 10x and won’t be mined for 50 or more blocks.
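Going the other way, from a fee ratio to a block count, is just inverting the h^{1-n} factor. A rough sketch, with a function name I made up for illustration and the same simplified single-pool assumption as above:

```python
import math
from typing import Optional


def min_blocks_for_ratio(h: float, ratio: float) -> Optional[int]:
    """Smallest n with h^(1-n) >= ratio, i.e. how far from confirming
    the existing tx must be before a fee `ratio` times larger stops
    being enough to keep a pool with hashrate share h waiting.

    Returns None for h == 1, where no delay is ever long enough.
    """
    if not 0.0 < h <= 1.0:
        raise ValueError("h must be in (0, 1]")
    if ratio <= 1.0:
        return 1  # h^0 == 1 already covers any ratio of 1 or less
    if h == 1.0:
        return None
    # Solve h^(1-n) >= ratio  =>  n >= 1 + ln(ratio) / -ln(h)
    return 1 + math.ceil(math.log(ratio) / -math.log(h))


if __name__ == "__main__":
    for ratio in (10, 1000):
        print(ratio, min_blocks_for_ratio(0.95, ratio))
```

For a 95% pool this gives 46 blocks for a 10x ratio and 136 blocks for 1000x, so the 50 and 135 figures above are round numbers in the right region rather than exact thresholds.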
Those numbers seem much more reasonable, though I’m not sure how you’d make any of this practical to implement – both the “this tx won’t be mined for n blocks” and “h percent of hashpower will wait to try to collect fees from the pinning tx” aren’t really things you can measure very accurately… Dealing with a 50 or 135 block delay might not be very practical for many applications either.