BTC Lisp as an alternative to Script

roconnor-blockstream · March 15, 2024, 9:59pm

In the last thread I commented on a high-level comparison between Chia Lisp and Simplicity. In this thread I want to comment on some lower-level comparisons between Chia Lisp and Simplicity.

I understand you are somewhat cool to the idea of using Simplicity, and I’d like to try to get you a little bit more excited about it. The use of combinators in Simplicity likely seems unfamiliar; in comparison, Lisp has a more familiar feeling and feels like it ought to be easier to write. But I want to argue that, in an extremely broad sense, Simplicity, Chia Lisp, and even Bitcoin Script are all kinda the same.

I write about this in more detail at Is it theoretically possible to have a better Operational Semantics than the BitMachine · BlockstreamResearch/simplicity · Discussion #89 · GitHub but there are three fundamental ways of composing computations: sequential composition, where the output of one computation is used as the input of another computation; parallel composition (parallel in the logical sense rather than necessarily an operational sense), where the same input is used by two different computations; and conditional composition, where an input is sent to one of two or more alternative computations and the output of the chosen computation is returned. (There is a fourth fundamental composition for computation, which is looping, something that Chia Lisp can do but neither Simplicity nor Bitcoin Script can do, though it can be kinda-faked with Simplicity.)
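As a rough illustration of the three compositions, here is a minimal Python sketch. The names seq, par and cond are made up for this example and do not come from Simplicity, Chia Lisp or Bitcoin Script; the tagged ("left"/"right", value) input is just one assumed way to model a choice between two alternatives.

```python
def seq(f, g):
    """Sequential composition: the output of f becomes the input of g."""
    return lambda x: g(f(x))


def par(f, g):
    """Parallel composition: the same input is handed to both f and g."""
    return lambda x: (f(x), g(x))


def cond(f, g):
    """Conditional composition: a tagged input is routed to f or to g."""
    def run(tagged):
        tag, value = tagged
        return f(value) if tag == "left" else g(value)
    return run


def double(n):
    return 2 * n


def succ(n):
    return n + 1


print(seq(double, succ)(5))              # 11
print(par(double, succ)(5))              # (10, 6)
print(cond(double, succ)(("right", 5)))  # 6
```

Each of the three languages offers some way of spelling these three shapes; only the surface syntax differs.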

Bitcoin Script (with its various duping, swapping and op_if-ing), Simplicity, and Chia Lisp can all perform these three sorts of compositions and, in some sense, it is just a matter of how they end up being expressed.

As you noted before, in Chia Lisp, expressions are evaluated in the context of an environment, where data is stored in a binary tree that is accessed by numbers that index into that environment. Put this way, you can view a Chia Lisp expression as a function from the environment to some resulting computation on that environment.
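To make the "numbers that index into the environment" idea concrete, here is a small Python sketch. It assumes the usual clvm path convention as I understand it (1 is the whole tree, 2 is the first of a pair, 3 is the rest, and deeper indexes are read bit by bit from the least significant end); the function name lookup and the use of nested tuples as cons cells are choices made for this example only, and real clvm edge cases are not modelled.

```python
def lookup(env, index):
    """Follow a clvm-style path number through a tree of nested 2-tuples."""
    if index < 1:
        raise ValueError("index must be a positive integer")
    node = env
    while index > 1:
        if not isinstance(node, tuple):
            raise ValueError("path runs past an atom")
        first, rest = node
        # Low bit 0 selects the first element, low bit 1 selects the rest.
        node = rest if index & 1 else first
        index >>= 1
    return node


env = (("a", "b"), ("c", "d"))

print(lookup(env, 1))  # (('a', 'b'), ('c', 'd'))
print(lookup(env, 2))  # ('a', 'b')
print(lookup(env, 6))  # b
print(lookup(env, 7))  # d
```

Note that every step has to check that it has not run into an atom, because nothing guarantees ahead of time that the index fits the shape of the environment.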

The way that Simplicity’s combinators work is fundamentally the same. In Simplicity, an expression is a function from an input type to an output type. In practice that input type is almost always some nested product type, e.g. ((A × B) × (C × D)), whose value is an “environment” of values of data types A, B, C and D. In turn, a product of types in Simplicity is fundamentally the same as a cons cell in Lisp: it holds a pair of values, which in turn can hold other pairs of values.

In Simplicity, if you want to access part of the input, which you can think of as an environment, you use Simplicity’s take and drop combinators. So if you want to fetch the B value from the previous “environment” you would use the Simplicity expression (take (drop iden)), which has type ((A × B) × (C × D)) → B. If you want to fetch the D value you use the Simplicity expression (drop (drop iden)), of type ((A × B) × (C × D)) → D.

This take, drop, iden idiom is so prevalent in Simplicity for “looking up values in the environment” that I typically use a shorthand where O means take, I means drop, and H means iden. I can write these examples as OIH : ((A × B) × (C × D)) → B and IIH : ((A × B) × (C × D)) → D. The O’s and I’s are meant to evoke the idea that you have encoded an index in binary and are looking up the value at that index in the environment.
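The lookup idiom can be modelled in a few lines of Python, treating a product value as a 2-tuple. This is only a sketch of the semantics, not how any Simplicity implementation represents expressions; the helper path is hypothetical, and it reads O as take and I as drop, which is the reading that the two typed examples (OIH yielding B, IIH yielding D) require.

```python
def iden(x):
    """iden : A -> A returns its input unchanged."""
    return x


def take(t):
    """take t : (A x B) -> C runs t on the first component of the input."""
    return lambda pair: t(pair[0])


def drop(t):
    """drop t : (A x B) -> C runs t on the second component of the input."""
    return lambda pair: t(pair[1])


def path(shorthand):
    """Build a lookup from a string such as 'OIH' (O = take, I = drop, H = iden)."""
    if not shorthand.endswith("H") or set(shorthand[:-1]) - {"O", "I"}:
        raise ValueError("expected a string of O and I followed by H")
    expr = iden
    # Wrap from the inside out: 'OIH' becomes take(drop(iden)).
    for letter in reversed(shorthand[:-1]):
        expr = take(expr) if letter == "O" else drop(expr)
    return expr


# An "environment" of shape ((A x B) x (C x D)).
env = (("a", "b"), ("c", "d"))

print(take(drop(iden))(env))  # b
print(drop(drop(iden))(env))  # d
print(path("OIH")(env))       # b
print(path("IIH")(env))       # d
```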

So while Simplicity’s combinators are all formally defined in terms of combining functions, you can think of those functions as denoting “a value given an environment”.

For example, in Chia Lisp, if you have two expressions L and R and you build a cons expression from them, yielding (cons L R), you get a Chia Lisp expression that represents a function from an environment that passes that environment into L and into R, forming a pair of the two resulting values. This is exactly the “parallel composition” described earlier. In Simplicity we have a pair combinator which is fundamentally the same. If you have two Simplicity expressions, s : A → B and t : A → C, then (pair s t) (also written as (s △ t)) is an expression of type A → (B × C), which takes an input of type A, passes it to both s and t, and then pairs the two results.
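In the same function-from-environment reading, the pair combinator is a one-liner. The Python below is a sketch of that reading only; fst and snd are stand-ins invented here for two arbitrary expressions s and t.

```python
def pair(s, t):
    """pair s t : A -> (B x C) feeds the same input to s and t and pairs the results."""
    return lambda a: (s(a), t(a))


def fst(p):
    """Stand-in for an expression s : (X x Y) -> X."""
    return p[0]


def snd(p):
    """Stand-in for an expression t : (X x Y) -> Y."""
    return p[1]


# Both branches see the same input, so this builds the swapped pair.
swap = pair(snd, fst)

print(swap(("x", "y")))  # ('y', 'x')
```

A cons expression built from two sub-expressions plays the same role on the Lisp side: both sub-expressions are evaluated against the one environment and the results are consed together.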

Simplicity is comparable to the clvm. clvm expressions such as (a (q a 2 (c 2 (c 5 ()))) (c (q a (i (= 5 (q . 1)) (q 1 . 1) (q 18 5 (a 2 (c 2 (c (- 5 (q . 1)) ()))))) 1) 1)) are just as unreadable as Simplicity expressions like drop (OOH △ IOH) △ (OH △ drop (OIH △ IIH) ; full-addₙ) ; IIH △ (IOH △ OH ; full-addₙ) ; IOH △ (IIH △ OH). In both cases expressions can be programmed by hand, but they are not meant to be programmed by hand. They are meant to be the target language of a compiler from some human-readable language, and are instead meant to be easy to machine evaluate (which is why in all the cases, Bitcoin Script, Simplicity and clvm, there are no variable names, as such, in the language). As you noted, you want to translate from a higher-level language to produce BTC Lisp, and the same goes for Simplicity, where Sanket and now Chris have been working on a higher-level language (code named Simphony) that can be translated to Simplicity, but where the higher-level language is not itself consensus critical.

Even though I’m arguing that all these languages are, in a very broad sense, fundamentally the same, they are indeed different, and these differences are not necessarily immaterial and are worth debating. There are lots of low-level choices to be made: what fundamental data types are there going to be (integers, digital signatures, elliptic curve points, compressed points, scalar curve values, hash values, strings, lock times, sequence numbers, optional values, and in the case of Elements additionally asset ids, confidential values, range proofs …), how are they going to be represented, which ones can be converted to each other, what set of operations is going to be made available, etc. There is a huge variety of choices that can be made, and Simplicity and Chia Lisp have made quite different choices here.

Probably the most significant lowish-level difference between Chia Lisp and Simplicity is that Chia Lisp is dynamically typed and Simplicity is statically typed. This difference likely leads the other design choices mentioned before down different paths. I’m doubtful we are going to solve the computer science question of dynamic types vs static types in this thread. However, I will mention that the choice of static typing in Simplicity does have operational consequences. It means that, once a Simplicity program passes type checking (type checking runs in quasi-linear time in the size of the program), the Simplicity interpreter can run without bounds checking (e.g. every “index” into “the environment” fetches its value without failure, because otherwise the Simplicity program wouldn’t be well-typed). Theoretically, some sort of JIT Simplicity compiler implementation could potentially operate quite fast without safety concerns.
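A toy Python model of the "check once, then run without checks" idea, restricted to environment lookups. This is not Simplicity's type inference algorithm, only an assumed simplification: types are either a name or a Prod of two types, and the functions type_of_path and run_unchecked are hypothetical. (Python itself still checks tuple indexing under the hood; the point is that the evaluator contains no checks of its own.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prod:
    """A product type: left x right."""
    left: object
    right: object


def type_of_path(shorthand, input_type):
    """Return the output type of an O/I/H lookup, or raise TypeError if it cannot be typed."""
    ty = input_type
    for letter in shorthand[:-1]:
        if not isinstance(ty, Prod):
            raise TypeError(f"{shorthand} does not fit the input type")
        ty = ty.left if letter == "O" else ty.right
    return ty


def run_unchecked(shorthand, value):
    """Evaluate a lookup with no shape checks; only call this after type checking."""
    for letter in shorthand[:-1]:
        value = value[0] if letter == "O" else value[1]
    return value


ENV_TYPE = Prod(Prod("A", "B"), Prod("C", "D"))
env = (("a", "b"), ("c", "d"))

print(type_of_path("OIH", ENV_TYPE))  # B
print(run_unchecked("OIH", env))      # b

try:
    type_of_path("OOOH", ENV_TYPE)    # one step too deep for this type
except TypeError as err:
    print("rejected before evaluation:", err)
```

In this model an ill-fitting index is rejected up front, so the evaluation loop never needs a failure path for it.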

In conclusion, Simplicity and the clvm are both low-level languages that are meant to be easy for machines to evaluate, which causes tradeoffs that make them hard for humans to read. They are intended to be compiled from some different, human-readable, non-consensus-critical language. Simplicity and the clvm are different ways of expressing the same old things: fetching data from an environment, tupling up bits of data, running conditional statements, and a whole bunch of primitive operations of some sort. Like clvm code, Simplicity code is pretty annoying to write by hand, and while no one else should be doing this, I have written, for example, an entire implementation of Schnorr signature verification on the secp256k1 curve, including its own implementation of libsecp256k1, in raw Simplicity. Of course, you would use primitives instead (what I’d call jets in Simplicity), but the point is that you can indeed build programs of that sort of complexity in Simplicity.