I’m trying to do both, actually. I’m also charging a fixed validation weight.
For the prefix mode, the cache strategy would be to define some constant interval N (say 10) and keep the hash midstate for each field at every N-th input/output. That keeps the cache’s storage requirement at O(#in/outputs), while the amount of data that needs to be hashed on each occurrence is at most N items.
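To illustrate the checkpointing idea, here is a minimal Python sketch (all names are made up for illustration, and hashlib’s copyable SHA-256 contexts stand in for whatever midstate representation a real implementation would use):

    import hashlib

    N = 10  # checkpoint interval

    class PrefixHashCache:
        """Keeps a SHA-256 midstate every N items of one tx field."""

        def __init__(self, items):
            self.items = items     # e.g. serialized inputs or outputs
            self.checkpoints = []  # midstates after 0, N, 2N, ... items
            ctx = hashlib.sha256()
            for i, item in enumerate(items):
                if i % N == 0:
                    self.checkpoints.append(ctx.copy())
                ctx.update(item)
            self.checkpoints.append(ctx.copy())  # midstate after all items

        def prefix_hash(self, k):
            """Hash of the first k items, hashing at most N of them."""
            ctx = self.checkpoints[k // N].copy()
            for item in self.items[(k // N) * N : k]:
                ctx.update(item)
            return ctx.digest()

Storage is one midstate per N items, i.e. O(#in/outputs), and any prefix hash costs at most N update calls on top of a cached midstate.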
I didn’t think about the scriptSigs specifically. They can indeed be of arbitrary size, and that can be a problem. Are there any other fields that can realistically be of arbitrary size within policy limits? For those, we could have a field-specific cache. For scriptSigs, this would mean we’d have to store an extra 32-byte hash for each input, which isn’t too bad. As most scriptSigs are empty nowadays, the hash of the empty string can be cached once, so this would be free for any segwit v0 or taproot input.
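As a rough sketch of what such a field-specific cache could look like (again illustrative names only, and plain SHA-256 where a real design might use a tagged or double hash):

    import hashlib

    EMPTY_SCRIPTSIG_HASH = hashlib.sha256(b"").digest()  # computed once, shared

    class ScriptSigCache:
        """One 32-byte hash per input; empty scriptSigs reuse a shared hash."""

        def __init__(self, scriptsigs):
            self.hashes = [
                EMPTY_SCRIPTSIG_HASH if s == b"" else hashlib.sha256(s).digest()
                for s in scriptsigs
            ]

        def commitment(self, selected):
            """Hash over the fixed-size per-input hashes, so the cost of
            committing to scriptSigs no longer depends on their sizes."""
            ctx = hashlib.sha256()
            for i in selected:
                ctx.update(self.hashes[i])
            return ctx.digest()

For segwit v0 and taproot inputs every entry is the shared empty-string hash, so building the cache costs nothing there.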
Yeah, I also read this shortly after sending my e-mail. That would probably not be a good idea. But as Russell also replied in that thread, it could be remedied by thinking long enough about what fields to expose so we don’t need to redo them. This means we might have one bit left in the current design to cover (or break up) something.
I could do that, yeah. I’m thinking mostly of Ark atm, but almost anything CTV does, TXHASH can do as well, with the added flexibility of being able to add fees. I can also do a version of doubletake using OP_CHECKTXHASHVERIFY (that use case could be moved entirely to Bitcoin if we had CAT+CSFS).