If your goal is to allow variable-size bignums so that you can do maths in the secp scalar field, you don't want to enforce minimaldata – otherwise, if you take the hash of a block header to generate a scalar and naturally end up with leading zeroes, you have to strip those zeroes before you can do any maths on that hash. Stripping an arbitrary number of zeroes is then also awkward under minimaldata rules, though not impossible.
(I think a minimaldata variant that only applies to values pulled directly from the witness stack, but not from program pushes or the results of operations, would be interesting.)
If there were a FROMFIXNUM opcode that takes the size of a fixed-size integer as one argument and a byte blob of that size as another and turns it into a variable-size integer, this would be easy (along with a BYTEREV opcode to deal with endianness).
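FROMFIXNUM and BYTEREV don't exist today, so purely as a sketch of what the pair would have to do, here is the conversion in C++ (assuming a big-endian input blob treated as a non-negative integer, and the usual minimal little-endian script-number encoding):

#include <vector>

// Hypothetical FROMFIXNUM behaviour: take a fixed-size big-endian blob
// (e.g. a 32-byte block-header hash) and produce the variable-length,
// minimally encoded script-number form.
std::vector<unsigned char> FromFixNum(const std::vector<unsigned char>& fixed)
{
    // BYTEREV step: script numbers are little-endian.
    std::vector<unsigned char> num(fixed.rbegin(), fixed.rend());
    // Strip the high-order zero bytes (now at the back) so the encoding is
    // minimal -- exactly the stripping that is awkward to do by hand under
    // minimaldata rules.
    while (!num.empty() && num.back() == 0x00) num.pop_back();
    // If the top bit of the most significant byte is set, append 0x00 so the
    // value is not interpreted as negative.
    if (!num.empty() && (num.back() & 0x80)) num.push_back(0x00);
    return num;
}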
Another metric to look at is who's paying for those extra bytes. So for a set of typical transactions (in existence now and expected in the future), how much does their size increase?
I'm going to code this up to confirm the ergonomics, so mistakes are likely in this post – call them out if you see them. Here is my understanding without actually writing the code yet:
Read int64_t representing satoshis from BaseTransactionSignatureChecker.GetTransactionData()
Convert the int64_t into a minimally encoded CScriptNum. I don't think this necessarily has to be done by an opcode; it could be done in the impl of OP_INOUT_AMOUNT itself.
Wherever the satoshi value on the stack top is consumed by another op code, we need to figure out how to allow for nMaxNumSize to be 8 bytes.
As an example for step 5, let's assume we are using the old (pre-64-bit) numeric opcodes.
You can see that we interpret the stack top as a CScriptNum, but that CScriptNum has an nMaxNumSize of 4 bytes rather than 8. This leads to an overflow exception being thrown by CScriptNum. The same problem applies to any opcode that uses CScriptNum to interpret the stack top (another example is OP_WITHIN).
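To make the failure concrete, here is a simplified mirror of the size check at the top of CScriptNum's vector constructor (the real class lives in script.h; only the relevant guard is shown). The old numeric opcodes construct with the default nMaxNumSize of 4, so an 8-byte satoshi amount on the stack throws:

#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

class scriptnum_error : public std::runtime_error
{
public:
    explicit scriptnum_error(const std::string& str) : std::runtime_error(str) {}
};

// Simplified mirror of the guard in CScriptNum's vector constructor.
void CheckScriptNumSize(const std::vector<unsigned char>& vch, size_t nMaxNumSize = 4)
{
    if (vch.size() > nMaxNumSize) {
        throw scriptnum_error("script number overflow");
    }
}

// CheckScriptNumSize(eight_byte_amount);     // throws scriptnum_error
// CheckScriptNumSize(eight_byte_amount, 8);  // what step 5 needs to allow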
Hi everyone, we have a Bitcoin Core PR Review Club tomorrow (3/20/2024) at 17:00 UTC for this implementation.
Happy to answer any questions about the PR, and to talk through the 'Design Questions' section, which I think still has open questions. Hope to see some of y'all tomorrow at 17:00 UTC.
Every opcode in interpreter.cpp that used to accept a CScriptNum input now accepts an int64_t stack parameter. For instance, OP_1ADD accepts an int64_t stack-top argument and pushes an int64_t back onto the stack, along with a bool indicating whether the OP_1ADD execution was successful.
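As a sketch of that shape (the name and signature here are illustrative, not the PR's actual helper):

#include <cstdint>
#include <limits>

// Illustrative int64_t-based OP_1ADD: report overflow via the return value
// instead of throwing like CScriptNum does. In the PR the success flag
// itself ends up on the stack for the script to act on.
bool Op1Add(int64_t operand, int64_t& result)
{
    if (operand == std::numeric_limits<int64_t>::max()) return false; // would overflow
    result = operand + 1;
    return true;
}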
I think this PR provides a better developer experience, as Script developers:
No longer have to think about which opcode to use (OP_ADD or OP_ADD64, OP_LESSTHAN or OP_LESSTHAN64, etc)
No longer have to worry about casting the stack top with previous casting op codes (OP_SCRIPTNUMTOLE64, OP_LE64TOSCRIPTNUM, OP_LE32TOLE64)
This 64-bit implementation would mean existing Scripts that use constant numeric arguments – such as those passed to OP_CHECKLOCKTIMEVERIFY/OP_CHECKSEQUENCEVERIFY – would need to be rewritten to pass in 8-byte parameters rather than 5-byte parameters.
This PR heavily relies on pattern matching on SigVersion to determine what the implementation of the opcode should do.
For instance, here is the implementation of OP_DEPTH:
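(Sketched below rather than copied verbatim from the PR; PushInt64 is an illustrative helper name. This sits inside EvalScript's opcode switch.)

case OP_DEPTH:
{
    // -- stacksize
    switch (sigversion) {
        case SigVersion::BASE:
        case SigVersion::WITNESS_V0:
        case SigVersion::TAPROOT:
        case SigVersion::TAPSCRIPT:
            // Existing behaviour: push the depth as a CScriptNum.
            stack.push_back(CScriptNum((int64_t)stack.size()).getvch());
            break;
        case SigVersion::TAPSCRIPT_64BIT:
            // New behaviour: push the depth as an 8-byte little-endian int64_t.
            PushInt64(stack, (int64_t)stack.size());
            break;
    }
}
break;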
In the future, if we want to redefine the semantics of OP_DEPTH, we can pattern match on the SigVersion and substitute the new implementation.
I think this provides us a nice framework for upgrading the interpreter in the future. The compiler will give errors in places where we aren't handling a new SigVersion introduced in the codebase (exhaustiveness checks), forcing us to handle that case.
FWIW, the big concern I have with this is people writing scripts where they don't think an overflow is possible, so they just do an OP_DROP on the overflow indicator; then someone thinks a bit harder, figures out how to steal money via an overflow, and does exactly that. That's arguably easily mitigated – just use OP_VERIFY to guarantee there wasn't an overflow – but I noticed that an example script in the review club used the more obvious DROP:
Worries me a bit when the obvious way of doing something (“this won’t ever overflow, so just drop it”) is risky.
You could imagine introducing two opcodes, "OP_ADD64" and "OP_ADD64VERIFY", the latter of which does an implicit VERIFY and hence fails the script if there was overflow; but that would effectively be the existing behaviour of OP_ADD. So I guess what I'm saying is: maybe consider an approach along the lines that sipa suggested:
where you change ADD to work with 64bit numbers (in whatever format), and add a new ADD_OF, MUL_OF (here OF implies “flag on stack” instead of “link in bio”)
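A rough sketch of the difference, with invented handler names (the _OF variant leaves the flag for the script to inspect; the VERIFY-style variant fails the script outright on overflow):

#include <cstdint>

// __builtin_add_overflow is GCC/Clang; 'result' holds the wrapped value on overflow.
bool AddOf(int64_t a, int64_t b, int64_t& result, bool& overflow)
{
    overflow = __builtin_add_overflow(a, b, &result);
    return true; // script keeps running; 'overflow' would be pushed onto the stack
}

bool AddVerify(int64_t a, int64_t b, int64_t& result)
{
    return !__builtin_add_overflow(a, b, &result); // false => script fails
}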
Unfortunately, as language designers we can only give people tools to build safe programs; we can't force them to always use those tools correctly. There will always be developers who figure out innovative ways to use your PL unsafely.
For those who want to write safe programs, this PR makes the tools to do so available to them; those tools don't exist at the moment.
Supports 8-byte computation with the arithmetic and comparison opcodes.
Doesn't add any new opcodes; rather, it repurposes existing opcodes based on SigVersion.
Changes the underlying impl type in CScriptNum from int64_t → __int128_t.
Preserves the behavior of the old CScriptNum (variable-length encoding, allows overflow results, but no computation on overflowed results).
For those who may not be familiar with the existing behavior of CScriptNum, I would suggest reading this comment in script.h:
/**
* Numeric opcodes (OP_1ADD, etc) are restricted to operating on 4-byte integers.
* The semantics are subtle, though: operands must be in the range [-2^31 +1...2^31 -1],
* but results may overflow (and are valid as long as they are not used in a subsequent
* numeric operation). CScriptNum enforces those semantics by storing results as
* an int64 and allowing out-of-range values to be returned as a vector of bytes but
* throwing an exception if arithmetic is done or the result is interpreted as an integer.
*/
The intention of this branch is to retain this behavior, only extending the supported range of values to [-2^63 +1...2^63 -1].
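A minimal sketch of what those semantics look like with the wider backing type (simplified; the real class also handles encoding and minimality):

#include <cstdint>
#include <limits>
#include <stdexcept>

// Operands must be in [-2^63 + 1, 2^63 - 1], but the result of an addition
// may overflow that range; it is only rejected if it is later used as an
// operand again (the old 4-byte/int64 split, moved one level up).
__int128_t AddScriptNums(__int128_t a, __int128_t b)
{
    const __int128_t kMax = std::numeric_limits<int64_t>::max(); //  2^63 - 1
    const __int128_t kMin = -kMax;                               // -2^63 + 1
    if (a < kMin || a > kMax || b < kMin || b > kMax) {
        throw std::runtime_error("operand out of range");
    }
    return a + b; // may exceed the int64_t range; held in the wider type
}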
Consensus risk
This PR changes the behavior of CScriptNum. For instance, the constructor for CScriptNum now takes a __int128_t as a parameter rather than an int64_t. This constructor is called for all SigVersions (not just SigVersion::TAPSCRIPT_64BIT). This seems like it could lead to some consensus risk with old nodes if someone crafts a specific transaction using segwit v0 or tapscript with a value that exceeds std::numeric_limits<int64_t>::max() but is less than std::numeric_limits<__int128_t>::max().
More problems with __int128_t
I chose to use __int128_t as it seemed like the logical way to extend the existing behavior of supporting overflow values.
I'm not an expert on C++, but it appears that __int128_t is not supported properly on Windows. I'm seeing CI failures on Windows that look like this:
D:\a\bitcoin\bitcoin\src\script\script.h(229,25): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int [D:\a\bitcoin\bitcoin\build_msvc\libbitcoin_qt\libbitcoin_qt.vcxproj]
D:\a\bitcoin\bitcoin\src\script\script.h(229,41): error C2143: syntax error: missing ',' before '&' [D:\a\bitcoin\bitcoin\build_msvc\libbitcoin_qt\libbitcoin_qt.vcxproj]
D:\a\bitcoin\bitcoin\src\script\script.h(290,33): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int [D:
Perhaps there is a better way to implement this that avoids __int128_t. I'm open to ideas.
For comparison, crypto/muhash.h uses __int128_t conditionally.
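That guard is roughly of this shape (not a copy of crypto/muhash.h, just the general pattern):

// Only rely on __int128_t when the compiler advertises it; MSVC does not.
#ifdef __SIZEOF_INT128__
using wide_int = __int128_t;
#else
// A fallback representation (two 64-bit limbs, or a small bignum class)
// would be needed here.
#endif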
Could maybe use uint64_t absval; bool negated; instead, with various manual checks to see if things end up out of bounds? Otherwise, just doing a bignum class might make sense (like arith_uint256).
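A minimal sketch of that sign-magnitude idea (names and details invented here):

#include <cstdint>

struct SignMagnitude {
    uint64_t absval{0};
    bool negated{false};
};

// Add two values with the same sign; returns false if the magnitude no
// longer fits in 64 bits (mixed signs would subtract magnitudes instead).
bool AddSameSign(const SignMagnitude& a, const SignMagnitude& b, SignMagnitude& out)
{
    uint64_t sum;
    if (__builtin_add_overflow(a.absval, b.absval, &sum)) return false;
    out.absval = sum;
    out.negated = a.negated;
    return true;
}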