How to linearize your cluster

Hi @stefanwouldgo, crazy! I see a vague relation with min-cut / max-flow problems, but would never have thought to look in this direction.

I haven’t been able to find the publication yet, based on what I find for related algorithms, but I assume n is the number of nodes (transactions), and m is the number of arcs (dependencies)? If so, that sounds great if it is practical as well. Note that the number of dependencies itself may be up to quadratic in the number of transactions, so this is essentially cubic in n?
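
Concretely, assuming a bound of the form O(nm + n^2 \log n) (as in the abstract quoted below), substituting the worst case m = \Theta(n^2) gives:

$$O(nm + n^2 \log n) \;=\; O(n^3 + n^2 \log n) \;=\; O(n^3).$$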

In the abstract of this paper, an additional condition is present (emphasis mine):

We present a simple sequential algorithm for the maximum flow problem on a network with n nodes, m arcs, and integer arc capacities bounded by U. Under the practical assumption that U is polynomially bounded in n, our algorithm runs in time O(nm + n^2 \log n).

It’s not clear if that condition is also present in the algorithm you’re citing, but if it is, that may be a problem.

Difference between what?

This seems great news to me. It should mean that we can accommodate much larger clusters.

Maybe. The cluster size bound is really informed by how big of a cluster we can linearize with “acceptable” quality (scare-quotes, because it isn’t all that clear what acceptable should mean), not by what we can linearize optimally, in an extremely small amount of time (we’ve imposed a 50 µs budget on ourselves, because a single transaction may simultaneously affect many clusters, which would all require re-linearization).

So far, we’ve interpreted acceptable as “at least as good as what we had before” (through LIMO), and “at least as good as ancestor sort” (the current CPFP-accommodating mining algorithm, which is inherently O(n^2)). But (so far) all the ideas in this thread are “extra”, in the sense that they’d only be applied when we have time left, or in a background re-linearization process that does not act until after transaction relay.
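
For readers unfamiliar with the baseline: here is my reconstruction of ancestor sort as a naive Python sketch (repeatedly pick the transaction whose not-yet-included ancestor set, including itself, has the highest feerate, and emit that whole set; names and structure are mine):

```python
from fractions import Fraction

def ancestor_sort(fees, sizes, deps):
    """Naive sketch of ancestor-set-based sorting (the O(n^2) baseline).

    fees, sizes: per-transaction fee and size lists.
    deps: list of (parent, child) index pairs.
    Returns a full linearization as a list of transaction indices.
    """
    n = len(fees)
    parents = {i: set() for i in range(n)}
    for p, c in deps:
        parents[c].add(p)

    def ancestors(v, included):
        # v plus all its not-yet-included ancestors, transitively.
        out, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u in out or u in included:
                continue
            out.add(u)
            stack.extend(parents[u])
        return out

    order, included = [], set()
    while len(included) < n:
        best, best_key = None, None
        for v in range(n):
            if v in included:
                continue
            anc = ancestors(v, included)
            key = Fraction(sum(fees[i] for i in anc),
                           sum(sizes[i] for i in anc))
            if best_key is None or key > best_key:
                best, best_key = anc, key
        # Emit the chosen ancestor set in a topologically valid order
        # (fewer remaining ancestors first means parents before children).
        order.extend(sorted(best, key=lambda v: len(ancestors(v, included))))
        included |= best
    return order
```

For example, with a 1-sat/vB parent, a 4-sat/vB child of it, and an unrelated 2-sat/vB transaction, the parent+child pair (ancestor feerate 2.5) is emitted before the standalone transaction, which is exactly the CPFP behavior this baseline provides.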

Now, it is quite possible that this algorithm is so fast that it can linearize larger clusters optimally in the time that ancestor sort + LIMO may need for smaller ones. That would obviously move the needle. Or it may be possible to make a reasonable argument that a time-bounded version of this algorithm (one that performs only a lower-than-optimal number of cuts, for example, and then stops) results in something that is practically as good as ancestor sort (e.g., sufficient to make CPFP work in typical, but potentially adversarial, scenarios).

That’s a concern, because we don’t just care about asymptotic complexity, but also about real performance for relatively small problems. And complicated algorithms, with complicated data structures, tend to result in things with high start-up costs (to convert the problem into the right representation), or with high constant factors (e.g., for some small problems, (sorted) lists can be several times faster than hash maps).


This is probably the point where I should reveal that we’ve been working on a new cluster linearization algorithm too (which I will write a bigger post about in time). It’s a result of a conversation we had at the Bitcoin Research Week, where Dongning Guo and Aviv Zohar pointed out that the “find highest-feerate topologically-valid subset” problem can be formulated as a Linear Programming problem, on which all LP solving methods are applicable. This implies two things:

  • Through Interior-Point Methods, the problem can be solved in O((n+m)^{2.5} \log S) time, where S is the sum of transaction sizes, or essentially O(n^5) in just n.
  • The simplex algorithm, which is worst-case exponential in the general case but very fast in practice, becomes applicable. And it is possible that for our specific problem, these exponential cases don’t exist at all.
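
For reference, one plausible shape for such an LP (my sketch, via a Charnes–Cooper-style transformation of the fee/size ratio objective; not necessarily the exact formulation from that conversation), with a variable y_i \ge 0 per transaction:

$$\begin{aligned}
\text{maximize}\quad & \textstyle\sum_i f_i \, y_i \\
\text{subject to}\quad & \textstyle\sum_i s_i \, y_i = 1, \\
& y_c \le y_p \quad \text{for every dependency } (p, c), \\
& y_i \ge 0,
\end{aligned}$$

where f_i and s_i are the fee and size of transaction i. Intuitively, the 0/1 indicator vector of a topologically-valid subset, scaled by the inverse of its total size, is feasible here, and its objective value equals that subset’s feerate.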

Inspired by this last point, we observed what the simplex algorithm’s “steps” translate to in our problem space (in terms of sets, fees, and feerates rather than matrix row-operations), removed some apparently useless transitions, and observed that it effectively finds a full linearization rather than just a single topologically-valid highest-feerate subset. The result is the following:

  • Input: n transactions with fees and sizes, and m dependencies between them (as (parent, child) pairs).
  • Output: a list of sets (the chunks), forming a partition of the graph, where each chunk is the highest-feerate topologically-valid subset of what remains after the previous chunks.
  • Algorithm:
    • For every dependency, have a boolean “active”; initially all dependencies are inactive.
      • These partition the graph into components (when two transactions are reachable from one another by travelling only over active dependencies, up or down, they are in the same component).
      • Thus, the initial state has every transaction in its own singleton component.
      • As an additional invariant, no (undirected) cycles of active dependencies are allowed, so the component’s active dependencies form a spanning tree for that component. The entire state can thus be described as an (undirected) spanning forest, which is a subgraph of the problem graph.
    • Keep performing any of the following steps as long as any apply:
      • If a dependency is inactive, and is between two distinct components, and the “child” component has higher feerate than the “parent” component, make it active.
      • If a dependency is active, and making it inactive would split the component it is in into two components (due to the no-cycles property, this is true for every active dependency), where the component containing the parent has a higher feerate than the component containing the child, make it inactive.
    • Finally, output all the components in decreasing feerate order.
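
To make the steps concrete, here is a deliberately naive Python sketch of the procedure above (it recomputes components from scratch at every step rather than maintaining the spanning forest incrementally, and the step-selection policy is just a repeated linear scan; names are mine):

```python
from fractions import Fraction

def linearize(fees, sizes, deps):
    """Sketch of the spanning-forest linearization described above.

    fees, sizes: per-transaction fee and size lists.
    deps: list of (parent, child) index pairs.
    Returns the chunks as a list of sets, in decreasing feerate order.
    """
    n = len(fees)
    active = [False] * len(deps)

    def component(start, skip=None):
        # All transactions reachable from `start` over active dependencies
        # (travelling up or down), optionally ignoring dependency `skip`.
        seen, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for i, (p, c) in enumerate(deps):
                if not active[i] or i == skip:
                    continue
                for a, b in ((p, c), (c, p)):
                    if a == v and b not in seen:
                        seen.add(b)
                        stack.append(b)
        return seen

    def feerate(comp):
        return Fraction(sum(fees[i] for i in comp),
                        sum(sizes[i] for i in comp))

    changed = True
    while changed:  # no termination bound is known; see the caveat below
        changed = False
        for i, (p, c) in enumerate(deps):
            if not active[i]:
                cp, cc = component(p), component(c)
                if c not in cp and feerate(cc) > feerate(cp):
                    active[i] = True   # merge: the child's component pays more
                    changed = True
            else:
                # Removing an active dependency always splits its component,
                # since active dependencies form a spanning forest.
                part_p = component(p, skip=i)
                part_c = component(c, skip=i)
                if feerate(part_p) > feerate(part_c):
                    active[i] = False  # split: the parent part stands alone
                    changed = True

    # Collect the final components and emit them by decreasing feerate.
    chunks, done = [], set()
    for v in range(n):
        if v not in done:
            comp = component(v)
            done |= comp
            chunks.append(comp)
    chunks.sort(key=feerate, reverse=True)
    return chunks
```

As a usage example: a 1-sat/vB parent with a 3-sat/vB child gets merged into a single chunk of feerate 2 (CPFP), while a 3-sat/vB parent with a 1-sat/vB child stays as two separate chunks, emitted parent first.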

This appears to be very fast in practice, and easy to implement. Further, it can be proven that if it terminates, the result is indeed an optimal linearization. However, we don’t have a proof that it always terminates, and certainly no bound on its runtime. A lot seems to depend on the policy for picking which dependency to make active or inactive in each step, but this needs a lot more investigation.