CTV vault output descriptor

Given the renewed effort to prioritise OP_CTV (BIP 119) over other soft forks that can enable vaults (e.g. OP_CCV) and the withdrawal of OP_VAULT, it’s useful to revisit how one can build vaults with OP_CTV.

@jamesob designed such a vault a few years ago:

OP_CTV allows the vault strategy to be used without the need to maintain critical presigned transaction data for the lifetime of the vault, as in the case of earlier vault implementations. This approach is much simpler operationally, since all relevant data aside from key material can be regenerated algorithmically. This makes vaulting more practical at any scale.

It could be modernised to make use of ephemeral (dust) anchors, and to use taproot script paths (see BIP 341) rather than OP_IF.

But for any vault construction to be useful in wallets, it needs to either fit in the existing BIP 380 output descriptor paradigm or develop an alternative.

In the context of OP_CHECKCONTRACTVERIFY (OP_CCV, BIP 443) @salvatoshi wrote:

My general take is that descriptors are the wrong tool for this purpose: a spend from UTXO X to UTXO Y where f(Y) = X needs to somehow encode the relation between X and Y as a predicate. While you could do that (every program is a predicate, and every program expressible in Script is a predicate that can be expressed in a tree structure like miniscript…) it quickly becomes unmanageable.

Perhaps this reasoning applies to OP_CTV as well. But given that it’s less powerful, perhaps for a simple use case like vaults it can still work? It would be more likely to gain adoption given existing infrastructure.

Let’s take a simple 2-of-2 multisig, where after an N block delay either party can spend the coins.

This can already be done, e.g. with Alice (A) and Bob (B):

tr(musig(A, B), {and_v(pk(A), older(N)),...})

However this requires Alice and Bob to move their coins at least every N blocks. Higher values of N mean fewer such movements, but also a slower recovery time. Additionally, BIP 68 limits relative time and height locks to about a year, so coins have to be rotated at least that often.
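To put a number on "about a year": BIP 68 stores the relative lock in a 16-bit field, counted either in blocks or in units of 512 seconds. A quick back-of-the-envelope check, assuming an average block interval of 10 minutes (the constant names are mine):

```python
# BIP 68 encodes a relative lock in the low 16 bits of nSequence.
MAX_LOCK_VALUE = 0xFFFF               # largest 16-bit value
TIME_GRANULARITY_SECONDS = 512        # unit for time-based locks (BIP 68)
ASSUMED_BLOCK_INTERVAL_SECONDS = 600  # assumption: 10-minute average blocks
SECONDS_PER_DAY = 86_400

# Longest height-based lock, converted to days under the block-interval assumption.
max_height_lock_days = MAX_LOCK_VALUE * ASSUMED_BLOCK_INTERVAL_SECONDS / SECONDS_PER_DAY

# Longest time-based lock, converted to days.
max_time_lock_days = MAX_LOCK_VALUE * TIME_GRANULARITY_SECONDS / SECONDS_PER_DAY

print(f"height-based: {MAX_LOCK_VALUE} blocks, about {max_height_lock_days:.0f} days")
print(f"time-based: about {max_time_lock_days:.0f} days")
```

So the ceiling is roughly 455 days for height-based locks and roughly 388 days for time-based ones.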

This is where a (CTV) vault can come in handy.

The key path remains musig(A, B), but the script paths allow either party, e.g. Alice, to move coins into a vault. After N blocks Alice can move the coins using a pre-determined key under her control (e.g. just A).

If Alice believes her key was compromised, or if Bob didn’t actually lose his key, either of them can immediately and without a signature send the coins back. Back where though?

With CCV the vault could be recursive, but afaik not with CTV. That’s ok because we wouldn’t want to go in circles anyway. Instead the fallback could be the design we started out with: tr(musig(A, B), {and_v(pk(A), older(N)),...}). That gives both parties N blocks time to sort things out between them, after which either side can move the coins. Including the delay of the vault itself, they have 2 * N blocks time.

So what would this look like as a descriptor? How about:

tr(musig(A, B),{and_v(pk(A),ctv(musig(A, B), {unvault_cold, unvault_hot})), ...})

Where:

  • unvault_hot is and_v(older(N), pk(A))
  • unvault_cold is ctv(musig(A, B), {and_v(pk(A), older(N)),...})

Unvaulting back to cold can be done without any signature, hence the nested ctv(), which is good when Alice and/or Bob are racing an attacker. But if they have some extra time, they could double-spend the unsigned unvault using the musig(A,B) keypath and send the coins straight to where they want them (even back to the original vault).

The ctv fragment has the same syntax as the tr() descriptor and would have very limited functionality here. It’s just the key path and list of script paths that the committed transaction must send to. Perhaps it should be called vault().
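To see the fully expanded form, here is a rough sketch that substitutes unvault_cold and unvault_hot into the outer descriptor. It is plain string building only: the ctv() fragment is the hypothetical syntax proposed above, nothing parses it today, and the "..." placeholders for the remaining script paths are kept as-is. The function names are mine.

```python
KEY_PATH = "musig(A,B)"


def timelocked_leaf(n: int) -> str:
    # Script path of the plain fallback: Alice can spend after n blocks.
    return "and_v(pk(A),older(" + str(n) + "))"


def unvault_hot(n: int) -> str:
    # After the vault delay, Alice spends with her own key.
    return "and_v(older(" + str(n) + "),pk(A))"


def unvault_cold(n: int) -> str:
    # Signature-free sweep back to the timelocked fallback (hypothetical ctv()).
    return "ctv(" + KEY_PATH + ",{" + timelocked_leaf(n) + ",...})"


def vault_descriptor(n: int) -> str:
    # Alice triggers the unvault; the committed output offers both exits.
    unvault = "ctv(" + KEY_PATH + ",{" + unvault_cold(n) + "," + unvault_hot(n) + "})"
    return "tr(" + KEY_PATH + ",{and_v(pk(A)," + unvault + "),...})"


print(vault_descriptor(144))
```

Printing it for a concrete N makes the two levels of nesting visible: the outer ctv() is the unvault step, the inner one is the sweep back to cold.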

Assuming the above makes any sense, I’d love to see someone implement it…

Cc: @shesek

FTR, I am slowly warming to @salvatoshi’s view that descriptors might be the wrong language to play with this. But I’m still exploring the idea as mentioned in the post. Descriptors must convey all the information necessary to spend the output.

The CTV spec specifies a bunch of other fields that are going to be hashed. For example, we need to decide what sequence value to set for the transaction, as well as the complete serialization of all inputs and outputs. If we don’t store all of this information, then we don’t satisfy the property that descriptors must contain all the information required to spend from it.
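For reference, a minimal sketch of the template hash, following my reading of the reference pseudocode in BIP 119 (please check it against the BIP before relying on it). The TxIn and TxOut classes and the placeholder scriptPubKey are illustrative assumptions, not part of any existing library.

```python
import hashlib
import struct
from dataclasses import dataclass


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def compact_size(n: int) -> bytes:
    # Bitcoin's variable-length integer encoding.
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


@dataclass
class TxIn:  # illustrative type: only the fields the hash commits to
    sequence: int
    script_sig: bytes = b""


@dataclass
class TxOut:  # illustrative type
    amount_sats: int
    script_pubkey: bytes

    def serialize(self) -> bytes:
        return (struct.pack("<q", self.amount_sats)
                + compact_size(len(self.script_pubkey))
                + self.script_pubkey)


def ctv_template_hash(version, locktime, inputs, outputs, input_index) -> bytes:
    r = struct.pack("<i", version)
    r += struct.pack("<I", locktime)
    # The scriptSigs hash is only included when some scriptSig is non-empty.
    if any(txin.script_sig for txin in inputs):
        r += sha256(b"".join(compact_size(len(txin.script_sig)) + txin.script_sig
                             for txin in inputs))
    r += struct.pack("<I", len(inputs))
    r += sha256(b"".join(struct.pack("<I", txin.sequence) for txin in inputs))
    r += struct.pack("<I", len(outputs))
    r += sha256(b"".join(txout.serialize() for txout in outputs))
    r += struct.pack("<I", input_index)
    return sha256(r)


# Placeholder witness v1 scriptPubKey (OP_1 plus 32 zero bytes), for illustration only.
PLACEHOLDER_SPK = b"\x51\x20" + bytes(32)

h_a = ctv_template_hash(2, 0, [TxIn(sequence=144)], [TxOut(100_000, PLACEHOLDER_SPK)], 0)
h_b = ctv_template_hash(2, 0, [TxIn(sequence=144)], [TxOut(100_001, PLACEHOLDER_SPK)], 0)
print(h_a.hex())
print(h_a != h_b)  # a one-satoshi difference gives a different template
```

The point for descriptors is that every argument here (version, locktime, each sequence, each output amount and script, the input index) has to be recoverable from the descriptor or from somewhere else, otherwise the hash cannot be rebuilt.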

One naive attempt could be to encode the entire transaction as hex in the descriptor. But we still want the flexibility to express BIP32 keys in there. Maybe if we had a new ctv_tx fragment in the descriptor language that looks like:

ctv_tx(version, nlocktime, inputs_hash, [out_desc1, out_desc2, ...])

where each out_desc_i is a BIP380 descriptor in itself, that would be a complete specification. This is clearly clunky, and maybe with some work we can simplify the fragment for common use cases. But in any such simplification, we must preserve the property that the descriptor contains all information required to spend the output (minus the private keys, of course).

Happy to try to do this once I get some spare time

I think you’d need to include amounts for each output, in addition to a descriptor that can produce the scriptPubKey? You’d probably want some other expression that can convert total_in_amount_pct(50) to 50% of the sum of all the input amounts or similar.
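As a toy illustration of what such an expression might resolve to, assuming integer percentages and rounding down (both are my assumptions, the idea above doesn't specify either):

```python
def total_in_amount_pct(pct: int, input_amounts_sats: list[int]) -> int:
    """Resolve a hypothetical percentage expression to an absolute amount in sats."""
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    # Integer arithmetic only; the remainder is dropped (assumed rounding rule).
    return sum(input_amounts_sats) * pct // 100


print(total_in_amount_pct(50, [100_000, 50_001]))  # 75000
```

Note that the CTV hash itself commits to absolute output amounts, so an expression like this would have to be evaluated against known input amounts before the hash can be computed.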

I agree for the general case. However, it’s been about 6 years since output descriptors were proposed for Bitcoin Core and we’re just barely getting around to proper MuSig2 support. Inventing a whole new paradigm probably means we won’t have working vaults (in Bitcoin Core) for a decade. (To be fair, there were other factors contributing to the long timeline, and with more developers involved it might go faster next time.)

Perhaps this requirement can be relaxed to allow looking up information in the blockchain. In Bitcoin Core at least the implementation of descriptors comes with a cache. Right now that’s a simple cache of just the derived keys at each index, as well as the most recently used height.

So the wallet could look for coins that match the top level descriptor, then detect spends that use the ctv() branch and reconstruct everything needed for the CTV hash. For that reconstruction to work we’d need some opinionated defaults, e.g. the nlocktime needs to match the nlocktime of the original transaction, or the ctv() fragment has an optional locktime offset argument.

I think the most important requirement from a usability perspective is that you only have to back up your descriptors once in the lifetime of a wallet. It’s fine if the wallet generates additional descriptors such as the ctv_tx() when it needs to, as long as those can be regenerated from a backup (and anything in the blockchain).

Thinking about it a bit more, it seems you can’t actually do what I’m describing here with CTV and it only works with CCV. In particular a descriptor can’t and shouldn’t commit to the amount that is sent to it, because the receiver doesn’t have control over that.

In particular a descriptor can’t and shouldn’t commit to the amount that is sent to it, because the receiver doesn’t have control over that.

(sorry on reflection this 1st paragraph is me mostly repeating your point …)

That’s true but are you not here just pointing to a limitation of using CTV in any case? If your spending of a utxo is constrained in this way, you have to provide an amount which also has a constraint (obviously usually it’s the other way round - you first decide an amount, and then construct a ctv hash). It’s true that CTV supports multi-input so it’s certainly not as simple as “your utxo must have exactly x satoshis + fees for an output of value x”, but it’s also true that the BIP somewhat strongly discourages more than one input, anyway.

I guess it’s like, if we tend to think of descriptors as “a thing that describes an unboundedly large pot you can put stuff into”, then that doesn’t work here, these are not that. They are customized pots of fixed size created dynamically when you’ve already decided what you want to put in them.

(I guess technically these limitations are not only on size, but e.g. timelocking, since you have to commit to nLockTime in advance, also).

If I were trying to be constructive (but in a state of relative ignorance about descriptors), I’d say, the descriptor could have the preimage of the CTV hash all serialized. I’m not sure why that wouldn’t be the correct thing to do, albeit it might not be a “normal” descriptor.

Yes, if I understand this limitation correctly, and if there’s no elegant way around it, it makes CTV not practical for vaults. Except perhaps for very sophisticated users, in particular those with existing (reliable, frequent) backup infrastructure.

The use case I have in mind is a “daily” wallet with a recovery mechanism, where if you don’t use the mechanism it shouldn’t get in your way. Otherwise it’s not going to be a UX improvement over the current state of the art of just rotating coins once a year as a dead man’s switch (and having your wallet remind you).

You can’t know this hash at wallet creation time. Indeed also because you won’t know what lock time to pick. Maybe you could have a giant tree of amounts and fixed dates?
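A rough feel for how big such a tree gets, assuming one leaf per (amount, date) pair in a balanced taproot tree. The control block size of 33 + 32 * depth bytes and the depth limit of 128 are from BIP 341; the example numbers are made up.

```python
TAPROOT_MAX_DEPTH = 128  # BIP 341 limit on the Merkle path length


def tree_cost(num_amounts: int, num_locktimes: int) -> tuple[int, int, int]:
    """Return (leaves, depth, control_block_bytes) for a balanced tree."""
    leaves = num_amounts * num_locktimes
    # Smallest depth d with 2**d >= leaves.
    depth = (leaves - 1).bit_length()
    if depth > TAPROOT_MAX_DEPTH:
        raise ValueError("tree is deeper than BIP 341 allows")
    control_block_bytes = 33 + 32 * depth
    return leaves, depth, control_block_bytes


# Hypothetical example: 1000 preset amounts, one lock time per week for a year.
print(tree_cost(1000, 52))  # (52000, 16, 545)
```

Depth is not the bottleneck here. The cost is that every leaf needs its own CTV hash computed up front, and a deposit still has to match one of the preset amounts exactly.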


Maybe the above is a bit too pessimistic; it doesn’t work for the use case I’m describing, and descriptors aren’t the right tool, but the original Simple-CTV-Vault design seems to only require the user to back up:

  1. the involved public keys
  2. one (?) private key
  3. the block delay
  4. each deposit transaction id

(1) - (3) are known at creation time. (4) requires continuous backups, but may be recoverable with context.