How to linearize your cluster

sipa · February 6, 2025, 6:16pm

Just in case I confused you here: I was trying to explain why this ascending property you mention later wasn’t obvious to me earlier. It doesn’t matter, it makes sense to me now.

My question was whether I had understood things correctly; given your responses, I believe that was indeed the case.

I see now. That may work, but it’s probably premature optimization. Evaluating the new diagram at all of the old diagram’s breakpoints may be just as hard as computing the new diagram outright?

That may be somewhat unfortunate. I was hoping it would be possible to spend some time finding some cut (not necessarily the minimal one), and then later revisit and find a better cut. Because even just finding a single min-cut is O(n^3) as I understand it (if m = O(n^2)), which is probably too much.

Right, but doing this in a “time-restricted” setting means you might end up with a linearization where the beginning is optimal, but the end is terrible, which might be undesirable.

You can think of LIMO as repeatedly doing:

Given a cluster with an existing linearization L for it
- Loop:
  - Find a topological subset S with good feerate to move to the front.
  - Compute L’, which is L with S moved to the front (leaving the order of transactions within S, and outside of S, unchanged from L).
  - Compute L’’ as a merge of L and L’, which has a diagram that’s at least as good as the best of both everywhere.
  - Output the first chunk of L’’, and continue with L as what remains of L’’.
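
To make two of the steps above concrete, here is a minimal Python sketch of building L’ from L and S, and of splitting a linearization into chunks. The function names (`move_to_front`, `chunk`), the dict-based fee/size representation, and the three-transaction example are all made up for illustration. The sketch assumes S is already a valid topological subset (it does not check this), and it does not implement the merge that produces L’’.

```python
def move_to_front(lin, subset):
    """Return L': the transactions of `subset` first, then all others,
    keeping the relative order from `lin` within each group."""
    front = [tx for tx in lin if tx in subset]
    back = [tx for tx in lin if tx not in subset]
    return front + back


def chunk(lin, fee, size):
    """Split a linearization into chunks with non-increasing feerate.

    Returns a list of (transactions, total_fee, total_size) tuples.
    Assumes all sizes are positive."""
    chunks = []
    for tx in lin:
        cur = ([tx], fee[tx], size[tx])
        # Merge with the previous chunk while the new chunk has a strictly
        # higher feerate; cross-multiplying avoids division.
        while chunks and cur[1] * chunks[-1][2] > chunks[-1][1] * cur[2]:
            prev = chunks.pop()
            cur = (prev[0] + cur[0], prev[1] + cur[1], prev[2] + cur[2])
        chunks.append(cur)
    return chunks


# Hypothetical cluster: C spends an output of B; A is unrelated.
fee = {"A": 1, "B": 1, "C": 6}
size = {"A": 1, "B": 1, "C": 1}

L = ["A", "B", "C"]
S = {"B", "C"}  # assumed topological: B's only in-cluster dependent is C

L_prime = move_to_front(L, S)
chunks_L = chunk(L, fee, size)
chunks_L_prime = chunk(L_prime, fee, size)
```

Here L chunks into a single chunk (fee 8, size 3), while L’ chunks into [B, C] (fee 7, size 2) followed by [A] (fee 1, size 1), so the first chunk that would be output after this step is [B, C].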

I suspect this can be done in combination with the GGT approach, but the more interesting combination is if it can feed back too, i.e. the S set finding algorithm can be bootstrapped by using the first chunk of L at that point. This may be harder.

I’m currently polishing up an implementation of the spanning-forest algorithm, so that I don’t forget the ideas I got while creating the writeup, and also to have a benchmark to target for future things (I think that comparing with the current exponential algorithm will be hard, as the style of graphs which are hard may differ wildly between exponential and GGT, but the difference between spanning-forest and GGT is probably smaller). After that, I plan to dig deeper into minimal-cut and GGT.