If the “unicity of corresponding chunking” theorem holds (which seems fairly obvious to me, though I lack the formality to prove it), then this also holds:
- C(A + B) = C(A + C(B)) \geq A + C(B)
- C(A + B) = C(C(A) + B) \geq C(A) + B
> If the “unicity of corresponding chunking” theorem holds (which seems fairly obvious to me, though I lack the formality to prove it), then this also holds:
Oh, this gets easier if you split C into two steps: one to raise `List Tx` into `List (List Tx)` (by putting every element in a singleton list), call it c=R(s), and the other that repeatedly merges adjacent chunks when they’re out of order and gives you the best chunking, b=C(c).
Then you have C(a) \ge a fairly straightforwardly (hopefully?), and also C(a+b) = C(C(a) + b) = C(a+C(b)) = C(C(a)+C(b)) directly from merge order independence.
Also, I think that means you’re always comparing chunks (`List Tx`) by feerate, and chunkings (`List (List Tx)`) by diagram.
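The two-step split described above (lift to singletons, then repeatedly merge adjacent out-of-order chunks) can be sketched in a few lines; a minimal Python sketch assuming transactions are (fee, size) pairs, with the leftmost out-of-order pair merged first (by the merge-order-independence claim, any merge order should give the same result):

```python
from fractions import Fraction

def feerate(chunk):
    """Average feerate of a chunk: total fee divided by total size."""
    return Fraction(sum(f for f, s in chunk), sum(s for f, s in chunk))

def chunkify(txs):
    """R, then the merge step: lift every tx into a singleton chunk, then
    repeatedly merge adjacent chunks that are out of feerate order."""
    chunks = [[tx] for tx in txs]  # R(s): List Tx -> List (List Tx)
    done = False
    while not done:
        done = True
        for i in range(len(chunks) - 1):
            if feerate(chunks[i]) < feerate(chunks[i + 1]):
                chunks[i:i + 2] = [chunks[i] + chunks[i + 1]]
                done = False
                break
    return chunks
```

For example, `chunkify([(5,1),(3,1),(1,1),(8,1),(0,1)])` yields chunks with feerates 5, 4 and 0.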
I was thinking of C as an operation on a list of (generalized, not necessarily monotonically decreasing feerate) chunks - whether those are obtained by starting with the singletons from a linearization, or something else.
We want to prove that C(\epsilon_n + \delta_n) \geq \gamma_n I think. I suspect it’s possible to adapt your derivation above to work by placing C() invocations carefully, and using the rules around it.
EDIT: this theorem is insufficient, as it doesn’t allow for cases where the moved good transactions get chunked together with bad transactions; a scenario that is possible in the prefix-intersection merging algorithm.
Given the following:
Under those conditions, moving the good transactions to the front of the linearization (leaving the internal order of good transactions and that of bad transactions unchanged) results in a better or equivalent linearization (by becoming the new first chunk, of feerate f).
Observe that, among the set of linearizations for the same transactions which leave the internal orderings unchanged, none can give rise to a chunk feerate exceeding f. If it did, there must be some topologically valid subset v of transactions, consisting of a union of a prefix of the good transactions and a prefix of the bad transactions, whose combined feerate exceeds f. No prefix of the bad transactions can exceed feerate f, thus there must be a prefix of the good transactions that does. This is in contradiction with the given fact that all the good transactions merge into a single chunk of feerate f.
Further observe that all good transactions merging into a single chunk of feerate f implies that any suffix of good transactions has feerate \geq f.
To prove that moving all good transactions to the front of the linearization always results in a better or equivalent linearization as the original, we will show that the following algorithm never worsens the diagram, and terminates when all the good transactions are in front:
To show that this never worsens the diagram, it suffices to see that reordering transactions within a chunk at worst leaves the chunk unchanged, and at best splits it, which improves the diagram.
Further, this algorithm makes progress as long as c_t does not start with all its good transactions. As long as that is the case, a nonzero amount of progress towards having all the good transactions at the start of the linearization is made. Thus, after a finite number of such steps, that end state must be reached.
It remains to be shown that c_t cannot start with all its good transactions unless the end state is reached. Assume it does start with all its good transactions. The set c_t \cap T is a suffix of the good transactions, and thus has feerate \geq f. Since c_t starts with c_t \cap T, the chunk c_t itself must have feerate \geq f. However, every chunk must have feerate \leq f. The combination of those two means c_t has exactly feerate f. There can only be one such chunk (as all c_i have distinct feerates), and it must be the first one (because no higher feerate is possible), so t=0. In this case, we are in the end state, because the first chunk is the only one with good transactions, and all good transactions are at its front.
I don’t like that assumption; you can only check it after you’ve done all the work, rather than beforehand, and it conceivably could turn out to be false. I think a better assumption would be “The first chunk of L has a feerate f_0 which does not exceed f”
I think by “topologically valid” you mean that no parent follows its child, and that all parents are included; i.e. if you have c spends p spends gp, then [gp, p, c] and [gp, p] are topologically valid, but [p, c] is not. This lets you say that if b is topologically valid, then for any a, if a + b is topologically valid, then b + a is also topologically valid. You also have a + b being t.v. implying a is t.v., and that gives you: b_1 + b_2 + \ldots + b_n being t.v. and a_0 + b_1 + a_1 + \ldots + b_n + a_n being t.v. gives b_1 + \ldots + b_n + a_0 + \ldots + a_n being t.v.
For the intermediate steps, you’re not moving txs all the way to the front, though, so I think you want something slightly cleverer still; perhaps a + b + c and a + c being t.v gives a + c + b being t.v is enough.
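This notion of topological validity can be made concrete with a small checker; a sketch where `parents` is a hypothetical map from each tx to the set of txs it directly spends:

```python
def topologically_valid(order, parents):
    """An ordering is topologically valid if it contains every parent of
    each of its members, and no parent appears after its child."""
    seen = set()
    for tx in order:
        if not parents.get(tx, set()) <= seen:
            return False  # a parent is missing, or appears too late
        seen.add(tx)
    return True

# c spends p spends gp (the example from above):
parents = {"p": {"gp"}, "c": {"p"}}
```

With this, `["gp", "p", "c"]` and `["gp", "p"]` pass, while `["p", "c"]` fails because gp is missing.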
I think you could rewrite this slightly:
Note that your good txs will never get split up once you’ve moved them, so c_0 will have the same composition and feerate it had originally until t=0 and c_0 \cap T = T, and merging some subset of the good txs into c_{t-1} will mean all of them are merged.
Also, I think your theorem needs a tweak: if you have equal-size txs with feerates a=0, b=6, c=9, d=0, e=10, and start with L=[a,b,c,d,e] and T=[a,e], then your original chunks were [a,b,c], [d,e], with both those chunks and T having a feerate of 5. But your new linearisation of [a,e,b,c,d] chunks to [a,e,b,c], [d] with feerates 6.25 and 0. Which is fine, it’s just that T doesn’t actually become the first chunk, since we end up finding a chunk with feerate greater than f instead.
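The chunkings in this example can be checked mechanically; a sketch assuming unit sizes and fees a=0, b=6, c=9, d=0, e=10 (note e=10, not 5, is what makes the stated feerates 5, 6.25 and 0 work out), with equal-feerate adjacent chunks deliberately left unmerged:

```python
from fractions import Fraction

def chunk_feerates(fees):
    """Chunk a linearization of unit-size txs (given by their fees) by
    repeatedly merging adjacent chunks that are strictly out of feerate
    order, and return the resulting chunk feerates."""
    chunks = [[f] for f in fees]
    fr = lambda c: Fraction(sum(c), len(c))
    done = False
    while not done:
        done = True
        for i in range(len(chunks) - 1):
            if fr(chunks[i]) < fr(chunks[i + 1]):  # ties are not merged
                chunks[i:i + 2] = [chunks[i] + chunks[i + 1]]
                done = False
                break
    return [fr(c) for c in chunks]

a, b, c, d, e = 0, 6, 9, 0, 10
L = [a, b, c, d, e]      # chunks [a,b,c], [d,e]: feerates 5 and 5
moved = [a, e, b, c, d]  # chunks [a,e,b,c], [d]: feerates 25/4 and 0
```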
It can be checked beforehand: it’s just the feerate of the first chunk of L \setminus T is \leq f.
Hmm, indeed, that would be a better starting point, as it exactly matches the conditions in the actual prefix-intersection merging algorithm. And it’s not equivalent to mine (as your example at the end of your post shows; in that example the highest chunk feerate of bad transactions is 7.5 which exceeds f=5). I’ll think about generalizing it.
Indeed, something like this needs to be included in the proof. Further, it’s currently unclear why it’s even allowed to move the c_t \cap T transactions to the front of c_t.
I don’t think you need an argument about what happens to c_t after rechunking. It’s sufficient that progress was made by moving a nonzero number of transactions towards the front. If you can keep doing that, all of them will end up at the front of L. And we prove progress will keep being made until they are all at the front of L, so that is indeed the end state.
Note that I started off by merging equal-feerate chunks, so there can be at most one chunk for any given feerate.
Right, this example does capture a situation we want the theorem to cover, but it currently doesn’t.
Sadly, the proof breaks entirely when chunks with feerate > f are permitted. It is possible, by following the “move good transactions to front of last chunk that has any” algorithm, to end up in a situation that’s better than the desired end state (all good transactions up front). The theorem is still likely true, but the proof technique of individual move steps to achieve it does not work, unless it can somehow be amended to never improve beyond the goal.
New attempt, with a fully “geometric” approach to show the new diagram is above the old one, but largely using insights about sets from @ajtowns’s proof sketch above.
Given:
Then:
Let’s call this the gathering theorem (it matches what the `VGATHER*` AVX2 x86 CPU instructions do to bitsets).
In the drawing below, the blue diagram formed using the \gamma_i+\zeta_i points (plus initial green \zeta_0 segment) is an underestimate for N, the diagram of L_G+L_B. The red diagram shows D, the diagram for L. Intuitively, the blue diagram lies above the red diagram because the slope of all the green lines is never less than that of any red line:
In what follows, we will show that every point on N lies on or above a line making up D. If that holds, it follows that N in its entirety lies on or above D:
Thus, all points on an underestimate of the diagram of L_G + L_B lie on or above the diagram of L, and we conclude that L_G + L_B is a better or equal linearization than L.
It is easy to relax the requirement that L_G forms a single chunk. We can instead require that only the lowest chunk feerate of L_G is \geq f. This would add additional points to N corresponding to the chunks of L_G, but they would all lie above the existing first line segment of N, which cannot break the property that N lies on or above D.
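Since both diagrams are piecewise linear, “N lies on or above D everywhere” can be checked by comparing the two diagrams at the union of their breakpoints; a sketch (not the actual implementation), with each chunk given as a (total fee, total size) pair and both diagrams covering the same total size:

```python
from fractions import Fraction

def diagram_points(chunks):
    """Cumulative (size, fee) points of a chunking, starting at the origin."""
    pts = [(0, 0)]
    for fee, size in chunks:
        pts.append((pts[-1][0] + size, pts[-1][1] + fee))
    return pts

def eval_diagram(pts, x):
    """Evaluate the piecewise-linear fee diagram through pts at size x."""
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return y0 + Fraction((y1 - y0) * (x - x0), x1 - x0)
    return pts[-1][1]  # beyond the end: total fee

def lies_on_or_above(chunks_n, chunks_d):
    """True if diagram N lies on or above diagram D everywhere; between
    consecutive breakpoints both diagrams are linear, so comparing at the
    union of their breakpoints suffices."""
    pn, pd = diagram_points(chunks_n), diagram_points(chunks_d)
    xs = {x for x, _ in pn} | {x for x, _ in pd}
    return all(eval_diagram(pn, x) >= eval_diagram(pd, x) for x in xs)
```

On the earlier unit-size example, the moved linearization’s chunks (fee 25 over size 4, then fee 0 over size 1) dominate the original’s (fee 15 over size 3, then fee 10 over size 2), but not vice versa.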
Merging individually post-processed linearizations (thus, ones with connected chunks) may result in a linearization whose chunks are disconnected. This means post-processing the merged result can still be useful.
For example:
```mermaid
graph BT
  T0["B: 5"];
  T1["C: 10"] --> T4;
  T2["D: 2"] --> T4;
  T3["E: 2"] --> T0;
  T3 --> T4;
  T4["A: 1"];
```
I think the intuition for this is that you’re constructing a (possibly non-optimal) feerate diagram for L_G+L_B by taking the subsets L_G + \sum_{i=1}^j d_i and showing this gives a better diagram than that of L. Because you can’t say anything about d_i (the bad parts of each chunk), you’re instead taking the first j chunks as a whole, which have a feerate of f or lower, and then the suffix of L_G that was missed, which has a feerate of f or higher, which kinks the diagram up. Kinking the diagram up means this subset has a better diagram than the chunks up to and including c_j.
FWIW, I think this was the argument I was trying to make in my earlier comment but couldn’t quite get a hold of.
I think the general point is just: L^* \ge L if (and only if?) for each \gamma_i (prefixes matching the optimal chunking of L) there exists a set of txs \zeta_i where \gamma_i \cup \zeta_i is a prefix of L^* and R(\zeta_i) \ge R(\gamma_i).
Ah! Actually, wouldn’t a better “diagram” be to consider the ray from the origin to these P(\gamma_i) points in general, i.e., for x=0 \ldots c_1 plot the y value as R(\gamma_1), for x=c_1 \ldots c_2 plot R(\gamma_2), etc.? Then (I think) you still compare two diagrams by seeing if all the points on one are below all the points on the other; but you have the result that for an optimal chunking, the line is monotonically decreasing, rather than concave.
Also, with that approach, to figure out the diagram for the optimal chunking, you just do the diagram for every tx chunked individually, it’s just f_{opt}(x) = \max(f(x^\prime), x^\prime \ge x).
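That suffix-maximum construction can be sketched directly; assuming txs are (fee, size) pairs, f(x) is the feerate of the ray from the origin to each prefix point, and f_{opt} is its running suffix maximum:

```python
from fractions import Fraction

def ray_feerates(txs):
    """f(x): feerate of the ray from the origin to each prefix's
    (size, fee) point, for txs given as (fee, size) pairs."""
    out, fee, size = [], 0, 0
    for f, s in txs:
        fee, size = fee + f, size + s
        out.append(Fraction(fee, size))
    return out

def optimal_ray_feerates(txs):
    """f_opt(x) = max(f(x') for x' >= x): the running suffix maximum,
    which is monotonically non-increasing."""
    out, best = [], None
    for v in reversed(ray_feerates(txs)):
        best = v if best is None else max(best, v)
        out.append(best)
    return out[::-1]
```

For unit-size txs with fees [5, 3, 1, 8, 0] this gives [5, 17/4, 17/4, 17/4, 17/5]: flat across each chunk of the optimal chunking ([5], [3,1,8], [0]), equal to R(\gamma_i) there, and non-increasing.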
Exactly. I’ve added a drawing to illustrate this, I think.
Do you perhaps mean plotting F(\gamma_i) (fee) instead of R(\gamma_i) (feerate)?
If so, that’s always monotonically increasing (just because fees are never negative).
FWIW, I think you don’t need case \gamma_0 + \zeta_0 in your proof, just the ones after each chunk suffice.
No, I meant fee rate – plotting fee gives you the current diagram, eg:
Plotting fee rate looks like:
Spreadsheet to play around: cfeerate - Google Sheets
I think that’s a question of formality. It’s of course obvious that if all points after each chunk lie above the old diagram, then the line from the origin to the first point does too. Yet, we actually do require the property that that line in its entirety lies above the old diagram too. Depending on how “obvious” something needs to be before you choose not to mention it, it could be considered necessary or not.
I don’t think this works. It is very much possible to construct a (fee) graph N that lies above graph D everywhere, but whose derivatives (feerates) aren’t.
EDIT: I see, you’re talking about average feerate for the entire set up to that point, not the feerate of the chunk/section itself. Hmm. Isn’t that just another way of saying the fee diagram is higher?
No, I mean that you can just choose the origin for your first point and go straight to \gamma_1 \cup \zeta_1 when constructing N, and run the same logic.
> EDIT: I see, you’re talking about average feerate for the entire set up to that point, not the feerate of the chunk/section itself. Hmm. Isn’t that just another way of saying the fee diagram is higher?
Yes, precisely – it is another way of saying the same thing, I think it’s just easier to reason about (horizontal line segments that decrease when you’ve found the optimal solution).
Translating the N is better than L proof becomes: look at the feerate at \gamma_i; now consider the feerate for \gamma_i + \zeta_i – it’s higher, because \zeta_i \ge f and the size of that set is greater, and an optimal chunking will extend that feerate leftwards, so N chunks better up to i, for each i, done. You don’t have to deal with line intersections.
> No, I mean that you can just choose the origin for your first point and go straight to \gamma_1 \cup \zeta_1 when constructing N, and run the same logic.
Right, that works, we could just drop \gamma_0 \cup \zeta_0 from N. I don’t think it actually simplifies the proof though, because the reasoning for the first segment (now from (0,0) to P(\gamma_1 \cup \zeta_1)), would still be distinct from that for the other segments.
> Yes, precisely – it is another way of saying the same thing, I think it’s just easier to reason about (horizontal line segments that decrease when you’ve found the optimal solution).
Consider the following:
> I don’t think it actually simplifies the proof though
Yeah, agreed; was more just thinking from the general case where \zeta_i might not have the same nice pattern.
> Consider the following:
Hmm, you’re right, my simplification to a flat graph doesn’t seem to do anything useful.
[EDIT: added the emoji as a possible post reaction]
Interesting, it appears that prefix-intersection merging is not associative.
That is, given 3 linearizations A, B, C of a given cluster it is possible that not all three of these are equally good:
This holds even when the inputs, intermediary results, or outputs (or any combination thereof) are post-processed.
Prefix-intersection merging can be simplified to the following:
So we only need to find the best prefix of the intersection of the highest-feerate prefix with the other linearization, and not the other direction.
This does not break the “as good as both inputs” proof, and from fuzzing it appears that this doesn’t even worsen the result (it could have been that the simplified version gives worse results while still being at least as good as both inputs, but that seems not to be the case).
I don’t have a proof (yet) that the algorithms are equivalent, but given that it’s certainly correct still, it’s probably fine to just use this simpler version?
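A sketch of this simplified merge (assuming unit sizes, ignoring topological constraints, breaking feerate ties toward the first linearization; none of these names are from the actual implementation):

```python
from fractions import Fraction

def best_prefix(lin, fees):
    """Highest-feerate prefix (the first chunk) of a linearization of
    unit-size txs; returns (prefix, feerate)."""
    best, best_fr, fee = [], None, 0
    for i, tx in enumerate(lin, 1):
        fee += fees[tx]
        fr = Fraction(fee, i)
        if best_fr is None or fr > best_fr:
            best, best_fr = lin[:i], fr
    return best, best_fr

def merge_simplified(L1, L2, fees):
    """Simplified prefix-intersection merge: repeatedly take the
    higher-feerate first chunk P among the two remaining linearizations,
    intersect it with the *other* linearization, and move the best prefix
    of that intersection to the output."""
    L1, L2, out = list(L1), list(L2), []
    while L1:
        P1, f1 = best_prefix(L1, fees)
        P2, f2 = best_prefix(L2, fees)
        if f1 >= f2:
            cand, other = set(P1), L2
        else:
            cand, other = set(P2), L1
        step, _ = best_prefix([tx for tx in other if tx in cand], fees)
        out += step
        L1 = [tx for tx in L1 if tx not in step]
        L2 = [tx for tx in L2 if tx not in step]
    return out
```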
If you have L1=[5,3,1,8,0] (chunk feerates: 5, 4, 0) and L2=[0,8,1,3,5] (chunk feerates: 4, 3) then bestPi chooses C=L2 \cap [5] to start with, ending up as [5,8,3,1,0]. Whereas calculating and comparing C_1 and C_2 would produce [8,5,3,1,0].
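The two candidate first chunks in this example can be checked with a small sketch (unit sizes, txs labeled a..e with fees 5, 3, 1, 8, 0 so that L1=[a,b,c,d,e] and L2=[e,d,c,b,a]):

```python
from fractions import Fraction

fees = {"a": 5, "b": 3, "c": 1, "d": 8, "e": 0}  # unit sizes
L1 = ["a", "b", "c", "d", "e"]  # feerates [5, 3, 1, 8, 0]
L2 = ["e", "d", "c", "b", "a"]  # feerates [0, 8, 1, 3, 5]

def best_prefix(lin):
    """Highest-feerate prefix (the first chunk) of a linearization."""
    best, best_fr, fee = [], None, 0
    for i, tx in enumerate(lin, 1):
        fee += fees[tx]
        fr = Fraction(fee, i)
        if best_fr is None or fr > best_fr:
            best, best_fr = lin[:i], fr
    return best, best_fr

P1, f1 = best_prefix(L1)  # (['a'], 5): L1's first chunk
P2, f2 = best_prefix(L2)  # (['e', 'd'], 4): L2's first chunk
# The simplified merge only intersects the higher-feerate side (P1 here)
# with the other linearization:
C_simple = best_prefix([tx for tx in L2 if tx in set(P1)])
# Computing and comparing both directions would also find:
C_other = best_prefix([tx for tx in L1 if tx in set(P2)])
```

`C_simple` is (['a'], 5) while `C_other` is (['d'], 8), reproducing the difference: the simplified version starts with the 5-feerate tx, the two-sided comparison would start with the 8-feerate one.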
But post-processing probably fixes that for most simple examples at least?
(Even if post-processing didn’t fix it; I think so long as merging produces something at least as good as both, that’s fine; quicker/cheaper is more important)
I think you can make post-processing fail to help for this example just by adding a small, low-fee parent transaction P that is a parent to both the 8-fee and 5-fee txs.
I think if you have five transactions, A,B,C,D,E, of 10kvB each at 50,30,10,80,0 sat/vb, and one transaction, P, at 100vB and 0 sat/vb that each of A,D spends an output of, with the same arrangement as above, then post-processing doesn’t fix it either?
Post-processing (work marked with *):