# Cost Analysis of Smart Contracts via Parametric Resource Analysis

It is an honor to give this invited talk at SAS and opening the conference. Many thanks to Mihaela, David, and the PC members for inviting us. This is joint work by Victor, Pedro, Maximiliano, Jose, and myself, done at the IMDEA Software Institute and UPM, and the idea is to present a quick overview of our Parametric Resource Analysis approach, and report on some of our experiences applying it to some contexts, and in particular to developing cost analyses for smart contracts. We start with some introduction on the application. Smart contracts are used to perform secure transactions in a distributed setting without the need for third parties. Often contracts and their storage are replicated in every node and any call to a contract is executed on every client.

Because of this, platforms like Ethereum, Tezos, TRON, etc. introduce upper bounds on execution time and storage, and associate fees with running a contract or increasing its storage size. This is basically gas and storage costs. Typically each instruction of the contract language has an associated cost. Therefore, storage space and computation time are interesting resources to study in these platforms, among others. Knowing beforehand the cost of running a contract in terms of computation time, in terms of increasing storage size, etc. can be very useful for users. For example, they can know how much they will be charged for the transaction. They can know whether the gas limits will be exceeded, and this means they will be able to avoid being charged for transactions that do not complete due to gas.

Now, most platforms do have simulation and testing facilities, but of course this will only give you the result for a particular input: it will prove that a particular input does not meet the specifications. But you could try a number of cases and then some other case could have a very large consumption and you may not catch it. So, in that sense, we would like to obtain guaranteed bounds statically, or through some combination of static and possibly dynamic -hopefully static- methods. So, basically only formal methods give us an answer for this and basically abstract interpretation or other proof systems can prove that a program does comply with specifications for an infinite set of possible inputs. So, we will do that: we will try in particular abstract interpretation.

Now, what are the challenges in this context? Well, we are addressing one challenge which is the following: there are other tools that infer gas costs, for example GASTAP, GASOL, and MadMax, but they tend to be platform- or language-specific. On the other hand, smart contract platforms change very rapidly. There are changes in the cost model, and there are changes in the language and in the semantics. We have developed an approach called Parametric Resource Analysis and a set of tools that implement this approach within CiaoPP, and we have used them for the development of resource analyses in a rapid way in many contexts, including resources like time, memory, energy, etc., and for different kinds of input languages.

So we believe that applying this method in this very changing context would be a good idea, and this is what we do in this case study that we are presenting: study applicability in this context with a concrete case study, the Tezos blockchain platform and its Michelson language, of our Parametric Resource Analysis approach. So, the first part of the talk will be devoted to this parametric resource analysis approach. The objective of resource analysis is to statically and automatically infer and verify upper and lower bounds on the usage that a program makes of resources. This can be execution steps, data sizes, time, memory -the classic ones. But also user-oriented or application- oriented resources like bits sent or received over a socket, SMSs, database accesses, procedure calls, files left open, money and gas spent, … these are application-dependent and user- defined. Now, resources in general can have different characteristics: they can be platform-dependent or independent, they can be cumulative or non-cumulative, they could be we could be interested in inferring actual bounds with concrete constants or asymptotic bounds.

Applications often use the actual bounds or need the actual bounds, … And they have also many applications, resources. For example, performance debugging, performance verification, certification of performance, security, absence of side channels, identification or finding attack targets, resource oriented optimization, resource driven optimization, granularity control in parallelism, which was our first motivation for all this resources work, and of course contract gas. Basically it is an area where much progress has been made since Wegbreit's landmark paper, we were the first complete analyzers in the 90s, I have put some of our references there, and then there is a lot of recent work in in this, and we have plenty of talks in SAS and invited talks even on the topic.

So, some characteristics and challenges of resource analysis. This is a simple program, a filter in C. An interesting thing here is that it has an input value, ELEMENTS, and then a loop that looks over the elements and then some if-then-elses. So, if we look at this the cost of this program depends on the value of ELEMENTS. There is no way to produce a concrete number, I mean, out of this, so we what we want t o do is infer functions, so what is the cost, as a function, of some characteristics of the input, which is what we call the metric, or the size of the inputs.

In this case it would be the size of ELEMENTS which in this case is a number, but it could be an array, or it could be anything else. This is a bit of a difference with the Worst Case Execution Time analysis and related methods. Also, this sort of resource consumption analysis is undecidable in general, of course, so we have to resort to approximations, and we like safe approximations, so we would like to infer upper and lower bounds, and that as accurately as possible. So what is the solution for that? It is abstract interpretation of course. So, if we do the analysis of this program what we want to obtain is a pragma like that one: a true assertion, that tells us that the energy consumption is between this and that, between this lower bound and that upper bound, in this case a function. There are two lines and our energy consumption is between those two lines. Great, now, the other important aspect of our resource analysis is parametricity.

Parametricity means that the programmer can define the resource analysis. In our approach, if the programmer defines the resource consumption of the basic elements: the instructions, the byte codes, the libraries, the macros, of the language, this is what we call the cost model, then the system infers the resource bound functions, for the rest of the program, by integrating the cost of the primitives. And for us this resource bound functions take the form of call pattern – research usage function pairs in the sense that, for a certain call, and in fact a certain path, to a point in the program, we get a certain cost function call.

We'll talk later about accumulated cost, which is a bit different. Anyway, this is done via these trust assertions that we see in the figure, you know, where we say that for, I mean, the description of the cost of each individual instruction is done with this trust assertion. So we say that for a given operation we define a cost, we define what approximation we are giving, if it is an upper bound or a lower bound, and so on. We give the name of the resource that we are talking about and then we give an expression that can be linear, exponential, logarithmic, and so on, with constants and so on, that describes the cost of that operation and that is a function of the arguments to that operation.

Here you see an example for a very simple operation like append where we see that the cost of append in the resource steps, that we defined, depends on the length of the input list and the value is actually length of X plus 1. So this is how we describe the parametricity, how to describe the cost of a primitive operation, or of any operation by the way, in this case.

For example, if we have this naive reverse example, and we are doing our steps, then we would infer things like this, like this statement here. Now these are true assertions because they have been inferred, i.e., they have been proved correct by the by the analyzer. And these statements say for example things about the number of steps, which is the resource that we defined of the program. You are saying that, for example, the append_1 predicate that we have there, in steps, basically would do the same number of steps as the length of the list, and that reverse, you know we do this is naive reverse, it is quadratic, so it would do exactly one half L square plus three halves of L plus 1, where L is the length of the input list.

Furthermore, we also infer the sizes of arguments because these are instrumental to be able to go to the next step in the program to, you know, since in every part of the program the cost since in every part of the program the cost depends on the size of inputs, and you have to have the outputs from the previous parts of the program, to compute the cost, it is very important to infer output sizes.

So, here in this case we are inferring that the output from reverse is the output list Y, and this is also a list of the same length as the input. Or, for example, that the output from append_1 is a list that is one longer than the input list. This is a table that gives examples of resources that can be defined with our parametric resources approach. There is a bunch of resources of different kinds, from insertions and deletions, disk movements, arithmetic operations, bytes read or written, and so on. As you can see there are all kinds of different functions, you know, linear, quadratic, polynomial in general, exponential, et cetera. These are examples from java programs, by analyzing the byte code. Again, you see many application-oriented, like bandwidth required, or SMSs, for example, for a phone code, energy consumption in this case (I will talk about that in a second). So these are again different cost functions, resources, size metrics, types of loops, and so on.

So, a brief pass over how we do the static resource analysis in CiaoPP. Well, the first step is that we have to perform all the supporting analyses. I mean, this is, e.g., the sized types and shapes analysis. This is necessary for knowing your size metrics, i.e., how you can measure the size or what are the relevant sizes of your data structures. Typically you also have to do some kind of sharing or aliasing analysis if you have pointers and so on, for correctness, to ensure that you don't have aliasing, or if you have it, and so on. You probably must do an exception analysis, or non-failure analysis in the case of logic languages, and this is typically necessary to infer non-trivial lower bounds, because you know that you're not going to stop the execution, that you're actually going to progress.

And determinacy and mutual exclusion: elimination of determinacy always helps of course to obtain tighter bounds. So all these are intervening analysis or previous analysis that you have to do. Then, once you have all that information, you set up recurrence equations, first on the sizes. In this approach the first thing is to solve the sizes. So, you set up recurrence equations for the size of each relevant output argument as a function of input data sizes, which ixs maybe non-deterministic. The size metrics as I said are derived from the inferred shape information, or types. Data dependency graphs determine the relative sizes of variable contents, and with that we compute solutions to these recursive equations by bounding them, depending on whether we are doing upper or lower bounds, to obtain output argument sizes of procedure calls or built-ins, etc. as a function of input argument sizes, and then using the recurrence solver, this could be the internal recurrence solver or we have interfaces with Mathematica, with Parma, with PUBS, with Matlab, and so on, plus other techniques like ranking functions or polynomial techniques, etc., then we infer the solutions to these recurrence equations.

Then you get things like this, for example, proves that the size of the list that is produced by concat is the sum of the those of the two input lists. Then, we repeat the process for the actual cost for the resources. the actual cost for the resources. We use these size functions to set up recurrence equations representing the computational cost of each resource in each block of the program, and we compute again bounds of the solutions with the same techniques. And with that, we obtain the resource usage functions. Now, what techniques do we use to do the intervening analyses first? Well, for that we use our standard analysis framework, the CiaoPP generic framework, which basically is an efficient, incremental, abstract OLDT resolution algorithm, for Constraint Horn Clauses. It is basically what we call the "top-down" algorithm: the original top-down algorithm, that we original top-down algorithm, that we proposed in the late 80s and is still in use. It maintains basically a call-answer table with dependencies, which represents specializations: for each call you have a corresponding success.

It is multi-variant in the sense that you have different calls and different successes for each predicate, or for each program point. It also has the paths and so on. Anyway, so it keeps dependencies Anyway, so it keeps dependencies through these tables. Characteristics. Precision: so it is context- and path-sensitive, and it has multivariance, but it does it in efficient ways, through tabling and so on.

A lot of techniques for efficiency: memoization, dependency tracking, detection of strongly- connected components, etc., base cases, … connected components, etc., base cases, … All these things are there in the system. Genericity: it's really a generic system in the sense that the domains are plugins. It is configurable in many ways. Of course you can plug in the widenings and it has several places: the call widening and success widening, and so on, that you can that you can plug in to. It handles mutually recursive methods, library calls, externals, it can be guided with assertions, so if in any point you want to improve the information you want to improve the information you can talk to the analyzers through assertions and get them to compute a tighter fixed point.

Then it has two interesting characteristics that I hope to be able to cover at the end: it is modular, i.e., it treats modules separately and with that it achieves a reduction of the working set, and incremental, which means working set, and incremental, which means that you can remember past analyses and reuse information so that only the minimum amount of work is done. Now, another detail about the resource analysis: we have incorporated the resource analysis into the abstract interpretation engine so that the analysis itself, the resources, is a domain. It is an abstract domain of piece-wise functions basically, so it is now one of the PLAI plugins.

This gives us all the advantages of being inside the fixpoint, let's say, which is the multi-variance, the easier combination with other domains of course, maintenance of versions and so on, easier integration with static debugging and verification, with run-time checking, … it just fits into the whole model, and a lot of engineering advantages. Another detail of the system is that it uses sized types, so when we infer shapes, we also annotate the shapes with the sizes, so that every point will have upper and lower bounds. So, for example if we talk about a list, to give a simple example, it will be a list whose length is between this lower bound and upper bound, of elements, each of which is within this bound and that bound.

An important characteristic of the system is that we support different languages and compilation levels, and this is done by program transformation. So, the idea is to have an internal, intermediate sorry, representation in Horn clauses. This process can also be seen as abstract compilation. The idea is that the Horn clauses capture the semantics of the original program. This is a very popular method nowadays, right? So, the source is the program, the target is a set of Horn clauses, and in practice there is, you know, there is the big-step semantics, the small-step semantics, and so on, but the method that works best most of the time for us is that in the end a block in the control flow graph is a Horn clause, and we'll see examples of that.

Now, to get to that point, to get from the original program to the Horn-clause representation, this can be done by partial evaluation using the Futamura projections and so on, and for that we can also use the CiaoPP partial evaluator, or Michael Leuschel's one, etc., or John Gallagher's, or it can also be done directly: you can write a compiler directly. We use that for all the analyses: class hierarchy analysis, shape/types, data sizes, and resources. So, this is going to be a part of our parametric resource approach: the translation into Constraint Horn Clause intermediate representation. This is a very simple example of this process: we transform here an XCore binary program in assembly and we basically go through our process which is: first we recognize the blocks, and the connections between the blocks.

We make the translation to Constraint Horn Clauses. Here you see how each clause corresponds to a block. and sides of if-then-elses or case statements or conditionals and so on, end up being clauses of the same predicate. Then, these are combined with a model, a cost model, which is a set of assertions. Remember the parametricity: these are the assertions that describe the primitives. In this case we describe the individual instructions. As you can see thy are of the of the assembly language. And for each of those instructions we give upper and lower bounds of the energy consumption. These are real energy values, checked against hardware and so on. And with that, if we return to our filter program, we can see, well we have the true that this allows us to infer that true information, the green information, i.e., it allows us to infer that true information directly, but also we can do verification in the sense that if we have an assertion like the blue one, the have an assertion like the blue one, the check assertion, which is giving us a specification in which it says that the energy consumption has to be within this limit, i.e., within these values of ELEMENTS, then we can compare that against the inferred information and get results like this where we identify -the system identifies- that for a certain interval of input values, of values of elements, between 1 and 120 in particular, the assertion, the specification is proved correct, is checked.

And for other values it is false. And in fact, there could be also ranges where we don't know if it's true or false. With this we finished with the first part, with the review of the Parametric Resource Analysis approach, and we tackle the second part, the case study: an analyzer for Michelsin gas and storage. So, in our problem here the source will be a Michelson program, or some higher level source that has been compiled to Michelson, and the output will be gas and storage consumption bounds for entry points and internal blocks, as functions of the entry parameters of the contract. And for the basic steps we will be following the approach: we will develop first an interpreter for Michelson, and for this we will translate directly the semantics of the instruction definitions in the Michelson specification. We will develop a compiler of Michelson into CiaoPP's constraint Horn-clause intermediate representation, based on a partial evaluation of the interpreter.

We will develop a gas model for Michelson's instructions, and we will feed it all to the CiaoPP analyzer. Now we start with a quick review of the Michelson language. It is the native language used by the Tezos platform. It is interpreted, it is strongly- typed, and it is stack-based. Despite being a bit low level, it still uses some high-level data structures like lists, sets, maps, big-ints, and others. A Michelson contract has three sections. We can see an example: the first section is the parameter section which gives us the type of the input argument, the next one is the storage section which gives us the type of the storage, the third one is the code section, which has the sequence of instructions to be executed by the interpreter. And the interpreter is a pure function that receives a stack and returns another stack, without altering its environment. The input stack basically contains only a pair (parameter, storage) which is the initial value of the parameter and the initial value of the storage, and then the output stack is also basically a pair that has a list of blockchain operations and the new storage produced after the execution of the contract.

The Michelson instructions can also be seen as pure functions receiving an input stack and returning a result stack. The table shows the semantics of the Michelson instructions used in the previous example of reversing a list. As an example we see that NIL inserts an empty list on top of the stack. Now, we have to provide the type of the elements that we will be pushing. SWAP exchanges the top two elements of the stack. And as an example of a more complex instruction ITER traverses the elements of a list, performing an action indicated by its arguments, which can be a macro or a sequence of instructions. There are other instructions that receive code as an argument. For example, control structures in the language such as ifs or loops: these are instructions that receive one or two blocks of code. Other instructions receive other kinds of arguments such as NIL which receives the type of the list to be built or PUSH which receives the type of the value of the element placed on the stack. As a concrete example a call to this contract with a list of numbers from 1 to 3 as parameter would present the following input and output stacks, as you can see here.

Note that, as the first instruction CAR in the contract discards the storage, the value of the storage at the beginning is irrelevant to obtain the result of the computation. The first step then is to translate Michelson into Constraint Horn Clauses. Here we see the same table as before, defining some Michelson instructions, and we have the equivalent representation as Herbrand terms in the Horn-clause representation. If you look, for example, at the SWAP instruction, you can see that the input stack and the output stack are represented by lists, and you see how the elements are swapped. You can also see how the first element is taken in the CAR instruction, etc. And the ITER instruction is implemented through a recursive loop in the Horn clause program. So, as we have said, we implement the semantics as a big step recursive interpreter: a direct transliteration of the semantics into Constraint Horn Clauses, where the data structures are represented with Herbrand terms. And here you can see a sketch of the interpreter. run/3 takes the input program and the initial stack and reduces it by executing the sequence of Michelson instructions to obtain the resulting stack.

And ins/3 is the instruction dispatcher, which connects each instruction term with its CHC definition, as we saw in the previous table. As preliminary transformations we introduce labeled blocks for sequences of instructions in the program, to help in later steps of the partial evaluation, and we consider them simply as new predicate definitions. We also rely on a simple implementation within the system, of the Michelson type checking, and this makes knowing the type of the stack at each program point a decidable problem. This allows us to specialize the polymorphic instructions depending on the type of the passed arguments, and it is particularly useful to specify, as we will see later, the semantics and cost of each instruction variant, which can vary depending on those static types. We have here an example of the ADD instruction, which is translated into one of seven possible primitive operations, depending on the type of the operands. Now, based on our interpreter we derive stepwise a simple translator, which combines a handwritten specializer for the run/3 predicate, the stack deforestation pass: including each stack element instead of the stack itself as predicate arguments, and a generic partial evaluation for the primitive structure definitions.

For example, we evaluate conditions, arithmetic instructions, etc., etc. The Michelson control-flow instructions receive both the control condition and the code to execute as inputs. The code arguments are bound to new constants, representing code blocks, and dispatch from ins/3. For each call, partial evaluation will unfold the if instruction and generate new instances, as we can see in the example. The stack deforestation step is especially useful in the output of control-flow instructions, which receive an argument for each element in the input and the output stack, as you can also see in the figure. This transformation is possible thanks to the Michelson semantics which forbids changes to the type of the stack in loops, and forces the type of both output stacks in branch instructions to match. To capture the actual cost of instructions, while using techniques mentioned here, we introduce an extension of the instruction definitions using cost markers. Partial evaluation will replace each of the primitive operations by its Constraint Horn Clause definition and the cost markers will be preserved to keep a record of the consumed resources at each step.

As a result of the transformation, some Michelson instructions, that simply modify or access the stack, will not even be represented in the output, only their cost markers if they are relevant. In this example, an additional transformation is performed: two different predicates are generated for each Miechelson entry point to the contract, and as a result we obtain a CHC representation of the contract that has two predicates which can be analyzed independently. Apart from this one, as we can see there is a one-to-one correspondence between both programs. The next stage in our approach is to obtain the cost model. There are several ways to obtain such a cost model: we can use an already defined cost model provided by the platform, or we can extract it from the source code of the platform by program slicing using a static analysis tool, or even manual extraction. Now, the Tezos-Michelson gas model has several components.

In CiaoPP's terms this is what we call a compound resource. Our cost model must be able to express gas as a function that depends on other Tezos resources. In particular, gas depends on allocations, steps, reads, writes, bytes read, and bytes written, with the expression that is shown. Here we can see the model, expressed with Ciao assertions, actually for two versions of the cost model of the Tezos platform: Carthage, which was the previous one, and Delphi, which is the current one. As you can see, the assertions, the Ciao assertions, are just a transliteration of the model as it is defined. Another thing that is important in the in the cost model definition is to define other details like the sizes of the output in terms of the input for every operation, for every instruction. Once we have a cost model and the representation of the contract in Constraint Horn Clauses, we can simply use CiaoPP to perform the resource analysis and all the auxiliary analyses it requires, which are shown there in the figure.

Here you can see a simple Michelson contract that performs some arithmetic operations, and here you can see the Constraint Horn Clause representation. You can see again that there is a one-to-one correspondence between the instructions in the Michelson contract and the instructions in the Constraint Horn Clauses. And you can also see the cost markers, etc. This is the cost information produced by CiaoPP for the example program. As you can see a correct upper bound is inferred for the size of the output argument, and also, thanks to the cost model we developed, an upper bound on gas consumption is inferred. It is actually log A squared plus 2 B, where A is the parameter and B is the storage. We provide in the paper some performance results, some of which we reflect here. We have tried to cover a reasonable range of Michelson data structures and control-flow instructions, as well as different cost functions, using different metrics. Many optimizations and improvements are possible, as well as much more comprehensive benchmarking, but we believe that the results shown suggest that relevant bounds can be obtained in reasonable times, and, given the relative simplicity of the development, that we argue supports our expectations regarding the advantages of taking our approach.

Regarding scalability, well, since the resource analysis is an abstract domain, it inherits all the characteristics of the general analyzer. So, this includes efficient multivariance, dependency tracking, multiple specialization, and many provisions for scalability that we have not been able to cover yet. I will try to do a very brief pass over a few of them. Some things like fine-grained incremental and modular analysis, user guidance through assertions, adaptable widenings, and other characteristics that are inherited from the CiaoPP framework. Verification, static debugging, abstraction carrying code: all these come for free. There are other characteristics of our resource analysis that are also, can also be very useful in this context, and also come with the framework, especially accumulated cost and static profiling.

So, to be able to know not just what is the cost of a call to a to a function, but also how much of the total cost of an application accumulates in a function. This is what we have called accumulated cost. Also, verification of admissible overheads: to have specifications on changes in overheads. For example if you introduce run-time tests, and so on. Anyway, to conclude we argue that our parametric analysis and our translation to Constraint Horn Clauses approaches offer a quick development path for cost analyzers in general, and for smart contracts in particular.

We have applied it to the Tezos platform, and its Michelson language, as test cases. The tool developed infers statically the desired gas and storage cost functions. The results are correct and reasonably accurate, and running times are quite good. And, our flexible approach we believe is very useful in environments that are have rapid change, environments that are have rapid change, and blockchain technology is one of them, because you have new languages, new cost models, and so on. In fact, as I mentioned in passing, while preparing the talk a new protocol was released by Tezos, and for us, we just had to update the cost model, something that was done in hours. The time went basically in understanding what the changes were, not in encoding them, and we just had to modify cost assertions without having to touch a line of the code.

We would like to thank Vincent Botbol, Mehdi Bouaziz, and Raphael Cauderlier, from Nomadic Labs, and Patrick Cousot for their very useful comments. And I don't want to finish without mentioning the whole team. I mentioned the co-authors at the beginning, but here you see all the members of our team at IMDEA now, and also some previous members. And, with that, I am happy to take questions.