Ethereum ÐΞVcon-0: Solidity, Vision and Roadmap

hello again everybody welcome to day two of DEFCON zero I'm gonna start by talking a little bit allow solidity the new contract language and then Christian Christian Christian is going to continue with the implementation a lot more sort of details and roadmap for the future of it so I'm just going to basically go into what drove the original so our main issue is that we need people lots of people to write contract and as as allaha as often it is brought up these contracts need to be correct because once they're on the blockchain unless you've got some sort of update mechanism in place but that brings along a whole slew of other questions like how do you trust the updates are going to be so these are the two main things that that were concerned with lots of people are able to write contracts and that these contracts are correct so one of the obvious ways of getting more people to write contracts is to allow them to write it in a language that they're already reasonably familiar with so since a lot of our prospective developers are going to be people who already understand web technologies HTML Javascript CSS it makes some sense to have the contract language be syntactically similar to one of these for JavaScript leverage the existing skill set as its there we also need to a language as only as good as the concepts that can express easily so it makes sense to build in the concepts that we know we're going to need to use often a contract language it makes sense to have contracts as a first-class citizen in that language so what we do is we take the notion of contracts and we take a paradigm that already exists object-oriented programming and if you think about it contracts can very easily be considered classes or objects depending on whether you look upon that the instance of the contracts or the contracts the prototype of the contract and so we base the language around this notion similarly within ethereal contracts off storage now having to store things in terms of literals is almost like having a beautiful high-level language and having to store your integers as bytes it doesn't really make sense so what we do is we say right well we are going to support the storage what we're going to do so in air with an abstraction layer and actually what the programmer sees is just what they're used to just to say useable types strings structures integers of various sizes billions and then finally contracts do a lot of i/o and whereas in normal object-oriented languages you may have ways of passing parameters as in parameters and out parameters C++ for instance uses pointers or references to do that if you've got there are language idioms that allow you to have multiple outputs sort of as part of it again a first-class portion of a language I believe go is one of these Jeff will probably be able to confirm that yeah and so this this makes a lot of sense within contracts where we are going to want to call things in and then take data out of the contracts without having to worry about things like references so then we come on to making things you know how do we make things guys how do you make it most likely that these contracts are going to be correct well the most obvious way is unit testing all right but unit testing at least would predict sort of basic unit testing frameworks are going to require efforts at least the small amount of effort from the programmer to put the unit tests in so what we do is we devise the language or at least the language development environment combination to make unit tests actually easier to have in than not so we do things like already have a space where the unit test goes and then the ability just to press enter and you get another unit there so just make it incredibly easy to place the expected results or a set of expected results in there the second thing that really helps us guarantee that a contract is operating according to specification for some simpler specification than the definition of the contract in encode is this notion of formal proofing so who is familiar with formal proofing anyone a little bit okay so formal proofing is the notion that you can you can describe you can specify the you can specify some expectation of a program and that this expectation can neither be proved or not proved but it that happens at the point of compilation not at the point of running the program so you can specify for instance that you expect no matter what state the program gets into you can expect all the balances will add up to a particular number and what you can do is if your program is written using a particular non turing-complete but not a particular language is still relatively rich subset of it you can use this that while the compiler can use this at compile time in order to guarantee that that statement always remains correct always no matter what as long one as long as you only use the methods but of course there's no other way in a theorem done to use the methods and that that will always remain true this is used in existing I think it's used in a lot of safety critical environments and new killer power plants I believe they use techniques like this the rolls-royce the big engines that you see on the side of airliners the reverse thrust mechanism and this was also uses a lot of the microcontroller for that also uses formal proofing I know that because my old university helped develop that so it's in use already but it's not it's not widely used and so it makes some sense for contracts which you know cannot be changed once they're in there to use and then the final final thing is the documentation documentation is you know absolutely fundamental to our security model when the decentralized application wants to make a transaction with the blockchain once the sign something with the user's key it's really really important that the user can read a statement as to what is going on with their key because if they can't do that will you know we'll back to step one this is absolutely fundamental to our this notion of having a sort of more trustworthy interface to massively multi-user applications and so we we build the documented documentation system actually into the language now this is you could argue that doxygen Javadoc already sort of have something similar and you know they do but what we're going to do is we're going to build on that and we're going to have rather than just placeholders actually have evaluatable expressions within the documentation itself so contractors first-class citizens here's a here's an example can I stand in any one place that will allow everybody to see most people so you've got the idea you know people who code a bit in C++ will probably this office you can void is long to read the function so it looks a little bit like a class as you might expect you got to set and get you know you pass the parameter into one this should read function get returns the other the other side so super easy oh so if we if we move on to the storage facilities again it should be relatively easy to see that we have this notion of a state right which is also filled the members of a class in this case it's a human and then we can define our set is just you know that again and I'll get those returning so this this to a C++ programmer will be super super simple but to a a jar a JavaScript programmer as well this portion of it will also be reasonably familiar the returns bit is a bit different but and the fact it's tight but can't really get around it now the storage facilities extend so what we also are able to do is map and this is a fundamental construct in this language we can provide mapping from a from an original domain range range from an original range in this case the range of addresses so this is the type of address which is a 160 bit integer into some arbitrary domain in this case endpoints and with with specified an endpoint as being I think this is a bit see the syntax is still up for alteration if you if anybody here has sorry to make it slightly more JavaScript Z without making it too difficult to pass and then we have the idea of setting we can make a new one assign it super easy stuff the leave for killing it very edit returns so if you want to get the information of an address we can specify D to return times you know now the formal proofing is probably the most interesting and challenging part of this I imagine Christians looking forward to that particular portion the note the the idea behind it is that we have here's an example of a cash contract so you give to somebody some amount and you check that the amounts greater in the Corliss balance then you subtract it to the destination check the balance of these you know we could I do a mapping to map our address to a to a numeric type and that's the balances again easy enough we got this construction so this is the initializer so we're just going to initialize the whoever created the contract to you know a billion or whatever some pre mind amount and then we have an invariant right this is this is the really clever bit so this is the bet that says at all times no matter what this contract must always satisfy this piece of sort of prototypical set theoretic that's what we're actually doing is we reduce the well we're gonna we're gonna map the idea of value of so value of just in this case is the idea of taking each item in the balances and just taking the value so the balance itself rather than be address mat sponsored you just a set of addresses and we're going to reduce that set of addresses with the plus operator so it's to say we're going to sum somebody and what we're saying is that should that certain sum of the balances should always equal a billion and what the compiler is able to do is go through the only non Const the only mutable method so the only method actually changes the state which is this one and do a formal proof of this code to check that if it's if it satisfies this constraint at the beginning and this code executes it will always satisfy the constraints at the end in this particular case it does and so the compiler would be happy now what that gets the the the author is the ability to state in the documentation that this for this line of code it says at all times total balances will be a billion now someone coming to this this contract – who maybe have some reputation it could be likely a favorable foundation or the event will be able to check that that that line of code indeed is a well representative of the documentation line above it tick yeah that's valid and then they don't need to check anything else because the compiler does it for them so when the user comes along and they say who the foundation made this the foundation ticked this line that saves cash at all times in this contract is a billion they can trust it they're like great and in none of that did we have to actually look at this code I'll compare the end of it so the final thing is the in job Docs so that's kind of what I was saying earlier we have the doxygen style at the moment triple triple slash which I guess most of you are probably familiar with the interesting thing of this is the backtick so the backtick allows us to insert arbitrary expressions JavaScript expressions into the documentation so what this does is it provides a notice for the user so when the user sort of you know clicks through on some data gets a transaction that that wants to submit the transaction into the network this is the message that comes up in the browser before it signs it so there's like a sort of black box in the browser right that contains all the secret keys and ultimately this could be controlled actually not in the browser itself but it could become salt by a trusted computing portion of the hard way so something from which even the operating system doesn't have access and what this this this would bring up a message it would say transfers and then it wouldn't actually save them at the score amount right it would say the actual amount it will place the amount in there so it evaluate what the scroll amount is and then cash from your account message of core give you either the name if you've gotten a name attached to it for name rich it's got a name that you trust or just the hex or maybe even a little graphic to the economic control by and then the same thing but but the destination returns cash don't really care about and then you've got the is endowed with the balance of the billion it will probably and then finally this magic line total pattern the system is always a billion and this is the thing that the the auditor or whoever whoever is that there are many auditors I mean you know it's decentralized check this line and that's the magic when the user comes to look at it they get an absolutely trustable statement cool so those are the basic driving factors and rough solutions to them and Christian is now going to go into a little more detail then we'll do our question and answer session at the end okay so yeah now let's see how this is all implemented first what were the requirements personality Gavin already told us about that so it's a contract oriented language which means that classes are contracts it is statically typed from the syntax close the JavaScript and when you are types to JavaScript you get something which is close to C or C++ the built in language documentation and some language subset that allows formal proofs of correctness yes coincidentally this is exactly the contract Kevin also showed so you already know that then the ubiquitous coin contract there we see something we've not seen in Gavin's talk the language aesthetically type but still it's not necessary to always specify the types you have this keyword var which can be used for local variables and it's similar to C++ Elevens auto keyword which means the type is just taken from the type of the first assignment to this variable so I think that's quite convenient okay then we see something users can define their own type so they can define struct and then mappings can map basic types to any type so we can have a struct at the as the value type we can also have mappings again as the value types we can all serve contracts as value types and this vote is the the struct type here so you can retrieve from the storage from this this mapping here the value which is stored at the senders address and then ya assign values to this local variable but this doesn't release this actually assigns it directly to storage because vote is not a local variable that is stored on the stack it's just the reference to the storage so you don't you can ya directly access storage here okay some teachers about the combination process it consists of six stages the first stage is just parsing after parsing we have the basic abstract syntax trees so we know the structure of the program and the next step is resolving identifiers so words identifiers strings can mean can refer to usually fine type names that can refer to function names whatever and these are all resolved in their respective scope and assigned in the abstraction to extreme and after that SAP it's possible to infer and check the type so that assignments do not conflict in types and functions are called with the correct number of arguments and and so on and after this third step it's possible for an IDE to actually use the abstract syntax tree for things like Auto completion or checking where a variable is used and all this nice stuff we want to have and the fourth step is compilation to who in assembly language then this assembly language is optimized and finally compared to bytecode some details about the type system statically type means that all expressions in this language have a fixed type that is known at compile time and an operation so it's not possible for an operation to involve two different types so you cannot add a string to a number but it is possible to do manual type conversions so if you manually explicitly convert the integer to a string then you can add them and at some points automatic type conversions are done if it makes sense and if no information is lost so we will see that later the types that are currently implemented are unsigned integer signed integers and hashes of various sizes another so this six is always the size and bits from eight bits to 256 bits in 8-bit steps and the difference between integers and hashes is that arithmetic operations are not allowed flashes because it doesn't make sense and a special case of 160 bit hash is address where so and for an address not even bit operations are allowed but instead you can yeah one way you can send ether to that address or query the balance at that address then we have boolean type mapping structs and contracts later we'll also have strings and but that's not implemented yet okay I hope you've seen this it's the solidity compiler compiler on script to be run offline in a browser okay so this is a basic contract without any content and on the right hand side we see the compiled op codes the compiled binary or hex version of the binary more detailed assembly and at the bottom of the abstract syntax tree and there's something in the function we will see the types that are inferred from these literals so it's in so the type of litter assal weighs the smallest type where it fits in this case it's an 8-bit unsigned integer and if we add an 8-bit unsigned integer to an 8-bit unsigned integer we again get an 8-bit unset integer nothing too fancy so it gets interesting when we use larger well years seven thousands and 16-bit eyelets at integer and we add an 8-bit 16-bit antonsen unsigned integer this 8-bit integer is implicitly converted to a 16-bit integer and so the type system ignores overflow so it's just about yeah so one we always know that the type of the result of a binary operation is one of the types of the operands so only one of the operands is converted we will see that X will have time 16-bit unsigned integer okay what happens if we if we want X to be an 8-bit integer we get the error that the 16-bit integer then it's on the right-hand side is not converted to an integer okay something that is perhaps also interesting is what happens if we add a negative number to a positive number so the compiler complains that you cannot find a common type and so the type of minus one is signed bit integer of eight bits and eight is an unsigned integer of eight bits and so you cannot convert one and to the other so there's no common type of course we know that both of them can be converted to an assigned integer of sixteen bits without losing any information but because of this restriction I told this is not on automatically but we can of course force that manually and convert this minus one to sixty bit and then it works X is also a signed integer of sixteen bits what do you mean it works on arbitrary expressions okay the optimizer so solidity does not compile to lrl but compiles directly to EVM assembly and uses LLS assembly class which also includes the optimizer so if improve the optimizer then solidity Serpent and lll will benefit from these improvements I already did some improvements of the lln optimizer is so most of the steps in the ll optimizer is just a pattern of opcodes is mapped to another – yeah to another set of opcodes and if the number of of course decreases then this is applied one of these patterns is constant folding so if you write seven plus eight then the optimizer will not do this addition at runtime but it will do it at compile time and just push directly push 15 and I added some other operands for this constant folding because they were more widely used in solidity and another optimization optimization is pop optimization this happens if you if you do some computation and it does not have side effects and you just discard the result of the of the computation then the optimizer removes the whole computation first an optimizer okay so we have to look at the assembly now from tax zero to this tech three that's the the body of the function and we see it it does push zero that is it initialize it initializes the local variable to zero then it pushes the two operands 8 + 7 X them this swap stores it at the location of X and top removes the old value of x so pushes that zero and then it pushes F which is 15 swaps it stores it and removes the old value so this is not yet optimal because we know it's not necessary to initialize X to zero when we change its value afterwards but yeah optimized has not finished yet to see pop optimization we have to remove this X you also have to remove the local variable so yeah what happens here now is okay yeah take zero that's the function body and nothing happens okay how does acidity to use memory and storage we've already seen everything is Uma initialized we can trust on that and local variables are currently used only on the stack so it doesn't it doesn't actually use memory uses memory only for up codes for operations where it's really necessary this might change later because the EVM can only look at at the stack at the 16 topmost position in the stack so if we have many local variables or very complex expressions and we want to store something in the first local variable we might have to look too deep into the SEC so we have to copy it to memory first and then unwind the stack and save later again the storage is allocated in the contiguous way so if you have one struct and then abstract then they will sit there in it directly next to each other in storage where mappings are an exception because mappings don't have a size so to compute the offset of a mapping you take the or to compute the offset of value in the mapping you take the offset of the map you concatenate the key compute sha 3 of that and that's the the offset of the value I think that's more or less exactly how it's done in serpent with the introduction or in serpent super no obvious way I would say ok reference to other contracts Gavin already gave it gave an example for that a bit more detailed we can include other contract files contract sauce files and yeah I use contracts also as values for mappings or directly in storage and we can so name rec edgar is an address we can so if you write it like that neighbor ik is a contract type we this is just usual type conversion so address can be converted to a contract type and then we store it in storage we can from this point on we can use name Rick and call functions on that and for that we don't need the full definition of the name or contract we only need to know its interface so this this name Rexall file can be a simplified which has all functions but not the function bodies it's different for this line here where we create a new contract and store it at this point in storage so here we actually need the full definition of the point contract to create it and when it's stored we can use it to call a function send amount to so that's yeah that's the function you've seen in the second example yeah what are the plans for the future I think we will have a complete first usable version by the beginning of December so yeah after def con 0 the current state is that most of the basic features are there what is missing is actually access to the blockchain so query balances and send funds and references to other contracts but that's future areas of work are yeah I shall write some tutorials or actually you create a specification of the language I think then depending on feedback from users we might add some new features to the we might need to improve the optimizer but yeah depends on how the language is actually used so we look at some real world example contracts then work on documentation for this net spec documentation where the next we see the next talk and actually creating the the IDE and then work on static analysis and formal verification and of course bug fixes in refactoring my own time or okay so yeah one planned language feature I don't know if that's interesting anonymous structs so you can do mappings from a struct type without actually creating a name for it you can so functions that return multiple values you can just store them in local variables directly and also return them like that so I think that's something we want but that's not implemented yet somebody is for the optimizer I think it's really important to be able to use other contracts as library contacts somehow at least for simple functions like computing the minimum or finding something in a list and for this it's important to so calls to other functions are quite expensive because you have to move all arguments to memory create so run this this call up code and then retrieve them from memory again and if the function is small then it's much more efficient to just copy the code to the to that location and another benefit of inline function costs also that the optimizer can actually optimize something there then it's perhaps necessary to reorganize the local stack so for example if two about two local variables do not really overlap in a function they can use the same stack slot and if the feedback results in really really bad performance for the compiler then it's perhaps necessary to consider some yeah simple intermediate language where more optimizations are possible than on this on the stack base dibianco so some ideas for static analysis constructors are not enforced currently that should be done then what is also quite easy I think without doing the whole process of form of proofing is to determine the value ranges of local variables and check whether some overflows can happen so if some addition was protected by pry or prior to that checking that an overflow can happen and perhaps also checking that from the value range some equality compare comparison or inequality comparison can never be true and then we see that the developer did some error there so some code is not reachable at all what is also useful I think is to estimate the gas usage of us at some source line so that while writing a contract you actually see how expensive the line is and you can perhaps change it to be less expensive and what we probably also want is if you issue a transaction you have to set the gas value and you probably want to have some estimate on how expensive it is some lower and upper bound this of course will in the end so the actual gas cost will in the end depend on the order transactions are introduced into the Box chain so yah will not be absolutely correct but it's it's so we can provide good lower and upper bounds you

You May Also Like