OP_CHECKCONTRACTVERIFY and its amount semantic

Can you elaborate on why? Validity is necessarily meaningful only at transaction-level. So I don’t know what you mean by “spill-over” - these new validation rules are by definition at transaction-level, the same way that CHECKSIG is.

On the opposite direction, it might be interesting to attempt a refactoring/simplification of the validation code so that jobs in ConnectBlock receive transactions to validate, rather than input scripts. That’s where the implementation complexity of any kind of cross-input logic stems from (because of the added synchronization code), and I strongly suspect that input-level parallelism doesn’t bring any measurable improvement in performance. Of course, this would need to be validated with benchmarks.

Thanks for demonstrating the approach. It would of course work, although I do see some downsides, while it’s not too yet clear to me what are the benefits.

I think you can remove the optional amount from the constraint (which is not used in CCV, since in the deduct mode, the amount must equal exactly the output’s amount).

The annex is only enforced by a signature, but signatures already cover output Scripts and amounts. So this is only meaningful if the constraints are also enforced in full by the opcode (CCV or any other new opcodes that might want to use this feature). Therefore, the opcode has to repeat in full the constraint as it appears in the annex, doubling the number of bytes needed to express it. That can be reduced to just a couple of bytes per constraint, so maybe that’s not too bad for CCV in terms of byte cost.

Some care would also need to be taken to make sure that the annex is not malleable (as the input Script could use CCV without any signature).

As a side note, I think a solution using the annex would be very similar in implementation complexity to my previous attempt in this diff, on inquisition, based on the deferred checks framework from James O’Beirne’s OP_VAULT PR (particularly, this commit). While that avoids the explicit synchronization, it’s quite a bit more complex than the code based on the mutex.

For an implementation based on the annex, I’d expect very similar complexity, as what’s accumulated in the deferred checks is isomorphic to what would be in the annex in your approach.