the following content is provided under a Creative Commons license your support will help MIT OpenCourseWare continue to offer high quality educational resources for free to make a donation or to view additional materials from hundreds of MIT courses visit MIT opencourseware at ocw.mit.edu okay let's go ahead and and get started so tad is in California today so he asked me to give today's lecture which will be based on a research project that we're working on at the DCI this is actually really connected to confidential transactions and confidential assets which I think that you guys covered in a previous class so Peterson commitments will play a very important role in the work I'm about to present I want to present this work that we've been doing this is joint work with Willie Vasquez who was an MIT MN he graduated and moved to UT Austin and madara's versa who's a research scientist with the DCI and this project is called ZK ledger it's about privacy-preserving auditing for distributed Ledger's so to start with I'm not sure how much work you guys have done on permissioned blockchain and so far how much have you covered with respect to provision blockchains nothing at all okay maybe we should start with should we start with what is a permission blockchain or do you guys feel like you have a handle on what that means or what that is it's most generic abstract sense yes yeah so um good question okay so you know these words that we use permissioned versus permissionless usually refer to who has write access to the chain so in crypto currencies like Bitcoin and aetherium theoretically any one of us could spin up our own ASIC get really lucky – and put the transactions that we are interested in into the chain now of course no one's going to accept that block if it violates sort of the majority consensus rules of the chain but assuming that it follows the rules it's well formatted it's that it has the appropriate proof of work etc any one of us could submit a block to the Bitcoin blockchain right there are there's no permissions on who is allowed and who isn't allowed to write to the Bitcoin blockchain permissionless blockchain sorry permissioned block chains are a different story these are more like disturbed databases the idea behind permissioned blockchains is that you choose a set of actors ahead of time and only those actors are allowed to write to the blockchain so you can't just have anyone willy-nilly spinning up a mining node or spinning up a node in the system and submitting transactions and things to the blockchain usually permissioned versus permissionless doesn't say anything about reads and who can read the blockchain we're really just talking about rights so an important use there you know permission blockchains i think are not really the focus of this course in part because we sort of take the position that the technology behind them are is essentially distributed databases and there's lovely courses out there on distributed databases and you know there's a lot to learn about them and they're very interesting but they're not necessarily sort of very closely related to cryptocurrencies and sort of the designs of the past decade so so that's the idea behind permission blockchains now as I was saying I think that there's a lot of use cases for permission block chains they're very useful distributed databases in general are very useful and so let's talk about one of them and I'm going to kind of interchangeably use permissioned blockchain and distributed ledger here distributed ledger permission blockchain I'm kind of sort of saying the same thing so let's start with so permission block chains any questions before we kind of dive into the use case and z:k ledger about permission block chains kind of get the idea it's a whole bunch of people working together to construct the ledger so let's start with the structure of the financial system and this is kind of setting the use case for z:k ledger and what we were trying to do and what we were trying to address so to start with there are dozens of very large investment banks that are trading securities currencies commodities derivatives many of these things are traded on exchanges regulated exchanges legs like Nasdaq or the New York Stock Exchange but up to 40% are actually traded in unregulated fashion they're done in a way that's called over-the-counter meaning that there is no regulated exchange sitting in the middle facilitating these trades instead the banks the entities trade amongst themselves in sort of a slightly more ad hoc fashion this represents trillions of dollars of the economy and just to give you a sense of scale sort of the time frequency we're dealing with we're not talking about high-frequency trading that's not the use case that CK ledger is targeting we're thinking like tens of trades a minute these are things that are that are done electronically it's not like people are people might be calling each other up to arrange these trades but oftentimes they're done electronically but we're not considering high-frequency trading so a useful abstraction of for thinking about you know a table of these trades is a transaction ledger so each row in the ledger is a transaction transferring assets from one bank to another so this should be very familiar to everyone it looks a little bit different than things like Bitcoin because since this is a sort of a ledger designed for the existing financial system we know who the entities are they have names they're regulated there they're you know they have companies behind them we're not talking just about arbitrary public keys so each row in the ledger is a transaction so here Citibank is transferring a million dollars so that's another thing that's very different than cryptocurrencies here we have multiple different assets being recorded in the ledger so we're not talking about a ledger with a native asset necessarily this is not a cryptocurrency really think about this like a table in the database recording transactions those transactions might be in dollars or Euros or municipal bonds or securities or shares they could be anything right so we're just kind of recording transactions so Citibank is transferring a million dollars to Goldman Sachs JP Morgan is transferring money to UBS and to Barclays okay so as with crypto currencies I think crypto currencies have kind of inspired people to reconsider the case of a distributed database as a ledger that multiple entities can write to and one of what part of that consideration involves thinking about digital signatures so maybe these banks have of public private key pairs and perhaps they use their keys to indicate consent on the transaction right this is something that's kind of done under the hood right now when you think about the protocols we use to exchange data but it's not necessarily done explicitly when conduct when creating these distributed databases so they can use their there are keys to attach signatures to the transaction to indicate a consent to transfer for security now this ledger might be maintained by a third party or it could be a distribute database of permissioned blockchain that is run by the transacting parties so there are some important financial invariants that you want to maintain here to verify that transactions are happening correctly that you know that the that they're valid that assets aren't created out of nowhere so you know one of them is consent to transfer and we use these signatures for that another important thing to verify another important invariant is that the person doing the transacting actually has the assets to transfer that you know they're not making up assets out of nowhere or they're not double spending that there isn't something incorrect happening and then that assets are neither created nor destroyed that we have conservation of assets so these are important financial invariants to maintain this problem this looks a lot like what you know the invariants are that we're trying to maintain an ax crypto currency like Bitcoin or aetherium except this was with multiple assets and we kind of know who the people are involved now if this ledger were being maintained by an exchange then the exchange would be responsible for kind of verifying and validating that all of these transactions are correct if this ledger is being maintained by the participants it's a permission blockchain then you know everyone can verify and validate that these invariants are being maintained by looking at the blockchain by checking the signatures right so that's great except for the fact that privacy is really really important and quite a bit of data and quite a bit of sensitive information is actually leaked by looking at these transactions so if we want to be able to maintain financial invariants if we want to be able to check and make sure that the ledgers correct and I mean this is sort of the whole point of blockchains is that everyone can sort of verify and validate and reconcile you know real-time then in the act of looking at those transactions you're actually finding out really sensitive information about banks trading strategy about their intellectual property okay for example a large trade might indicate that a bank has decided to get out of a position other banks can learn about this notice it on the blockchain as they're validating transactions and try to follow suit driving down the price so privacy especially in this type of context is really important and I'm sure you can imagine a lot of other different types of contexts in which privacy is important as well right so we don't want to be in this situation where we have to choose between privacy and being able to validate what's going on right so even if we were to use a trusted third party like an exchange we still wouldn't have privacy necessarily because the exchange would be looking at everything in the ledger so that's sort of the goal here that we're trying to achieve we're trying to achieve this goal where we want to be able to publicly verify and validate what's happening in the ledger and at the same time obtain some amount of privacy for transactions we don't want all transactions to be public to whoever is maintaining the ledger be that a third party exchange or a set of banks so I'm just gonna pause right there and and sort of you know do we kind of get this set up here privacy versus verification we want both any questions about that yes yeah that's a great question so I'm you know I I have dollars and euros here right how do you know that Citibank actually has those dollars to transfer to goldman sachs right where does that come from notice how this is transaction number ninety i'm assuming that there's a transaction up here this is just a subset of the ledger but as you'll see in a moment we we think about this role of depositors who are allowed to deposit assets into the ledger and so i'm assuming there's a transaction up here somewhere which in a sort of like provable way shows that someone we trust to deposit assets into the ledger gave citibank the dollars necessary for them to give to goldman sachs but you're absolutely right if the asset isn't native to the ledger then you are you do have that connection to the real world where you do need to apply some level of trust the question is it is it auditable can you at least track it back up to that point any other questions okay great so um so we want to do the following we want to be able to verify that financial invariants are maintained that things are correct while achieving privacy so one good technique for achieving privacy is simply to encrypt or hide the transactions in the ledger okay so surprisingly maybe we actually know how to do this and to at the same time still achieve checking of the financial invariants here are two systems that actually achieve this the first is zero cash which was implemented inside of Z cash an anonymous cryptocurrency in Z cash transactions are sort of very they're completely hidden they're completely opaque shielded transactions in C cash G cash has two trains of transactions one set is public like in Bitcoin the other set it are called shielded transactions and in a shielded transaction you don't know who the parties are who involved are involved you don't know the amount of the asset that's transferred you you can validate that that this was correct that money you know financial Marian's were maintained zero cash and Z cash use a primitive called zero knowledge proof and in particular ZK snarks zero knowledge succinct arguments of knowledge those were actually developed in part here at MIT so so using these sort of like really interesting cryptographic primitives you know we're able to maintain financial invariants in Z cash there's another system called solidus which is out of Cornell which also achieves this you can't tell who the participants are and you you can still maintain financial invariants solidus uses a completely different cryptographic primitive called PV ORM so you should check out what these papers if you're interested it's definitely still possible to do this to do this to accomplish this right however there's still a problem here both of these systems hide everything you get no insight into the economy into the blockchain into what's actually happening in the system regulators in particular need insight into markets in order to maintain financial stability and in order to protect investors even considering non-financial use cases it's oftentimes the case that people who might be using a blockchain or distributed ledger distributed database want to be able to to maintain that certain things are true and to get some insight into the system that they are that they are that they're a part of lack of insight can actually have devastating effects as we saw with toxic assets in 2008 one you know part of the problem one contributor to the crisis was the issue that people didn't really have confidence or insight into exactly what assets banks had on their books it was a bit confusing and surprisingly this happens to this day I think there was a case with with Dow Chemical where they didn't actually know how many outstanding shares that they had issued like they just didn't know and then um they had to they had to give out a dividend and they would they just were like they gave out too much money or too little money or something like that so here are some examples of things that we might want to measure and understand we might want to get a sense of leverage in the system so what you know the participants who are holding assets like how much do they have on mass we might want to understand exposure we might want to get a sense of market concentration you know it's pretty useful to know things like okay this Bank like this asset in particular is very highly concentrated or this asset is more distributed amongst the participants of the system so there are things that we we might want to learn about the economy represented by the assets in the blockchain and that if the entire thing is sort of encrypted or completely opaque we're not going to be able to really gain this insight and toxic assets like I said so here's a specific example of something we might want to learn you know here's a bank here's an auditor and the auditor might want to and let's assume the bank is using a system like ZK ledger provision blockchain and auditor might want to know how exposed is this bank to a drop in the euro so if the you know the euro versus the dollar changes dramatically what's gonna happen to this banks balance sheet so nzk ledger the auditor can ask a question like what fraction of your assets are in euros and the bank might respond three million out of 100 million so you know three million euros 1 million dollars something like that but you know just simply asking this question the auditor doesn't really get any any sort of assurance that this answer is correct and so this is sort of the Nugget of the problem that we are trying to solve and achieve so this talk presents the UK ledger and it's an a private auditable transaction ledger the idea behind ZK ledger is like other systems we also hide the transacting banks and the amounts involved we provide integrity with public verification so despite the fact that you can't really see the details of the transaction anyone not just the participants but anyone you want to sort of show this to can publicly verify and convince themselves that transactions are well-formed and financial very invariants are maintained and an auditor can interactively compute provably correct linear functions over the transactions so things like trying to understand market concentration or leverage or exposure they can compute sort of a set of queries over the contents of the transactions in the ledger okay so I'm going to describe to you how zk legit works I'm going to sort of give you a brief overview of the system model and then I'm gonna start to go into some of these sort of more interesting pieces inside ZK ledger so the commitments that we use to hide things sort of the the way the ledger is constructed and I'm going to tell you a little bit about the primitives that we use which are zero knowledge proof not snarks which I talked about earlier a different type is your knowledge proof but I'll give you sort of a little bit of a flavor of what those things look like and then I'll show you sort of the performance of ZK ledger so you know whether you buy this use case or not I think that you'll see these topics come up over and over again as you're looking at crypto currencies people are quite interested in zero knowledge proof and how they can use them to provide privacy inside of crypto currencies so okay Before we jump into the system model any questions okay let's go so let's start with the system model so here's the system model we have a set of banks these banks have generated public keys and secret keys which they have and these banks are working together to construct a ledger and the way that this ledger works is that they are determining transactions amongst themselves so they're deciding okay you know I'm gonna transfer this amount to that Bank and then once they've sort of decided on a transaction they append that trend action to the ledger and again remember this ledger could be maintained by a third party it could be maintained by the banks themselves in a blockchain they could be using a public bulletin board they could be posting transactions to Twitter it doesn't matter we're not sort of really concerned about that right now the point is that there is a n append-only ledger and that's sort of orthogonal to sort of our techniques how exactly it's maintained an auditor a third party auditor can obtain correct answers to questions based on ledger contents so note that auditing is interactive it's not the case that someone can just take this ledger and compute things over it they can't do that but there's not there's not enough information here to reveal the answers to queries the auditor has to actually talk to the participants in the ledger so you might say well we didn't trust the participants to give the audit or the right answer why would we trust them to answer the auditor at all well kind of in the setting that we've set up for ourselves if they don't answer the auditor they can go to jail okay so at least you know you can tell whether the bank is answering you or not if the bank were to lie to you you might not be able to tell whether or not you know this allows you to be able to if you get an answer you know that that answer is correct so the auditor will query the bank and ask a question similar to what we were asking before what fraction of your assets are neros the bank can respond but this time instead of just getting an answer the auditor also gets a small part of a proof from the bank and can confirm that that proof is correct based on these opaque transaction details in the ledger okay so the auditor still needs to to talk to the bank but using the bank's response they can compute a proof that shows that this answer must be correct okay so auditing is interactive like I said it's possible bank might choose not to respond that could happen also I want to point out that if the auditor is constantly auditing this ledger it's constantly asking how many euros do you have how many euros do you have how many euros do you have then if that number changes it's reasonable to assume that that change happened in the last transaction appended to the ledger right so you know if the if the bank continually answers the auditors questions this will leak transaction contents this doesn't provide what's known as differential privacy okay and so kind of the setup here is that the auditor and the and the bank should come to some agreement about the appropriate level of sort of frequency of auditing such that both the auditor gets the data that they need and the bank maintains the privacy that it needs to maintain okay so as I said before ZK ledger supports any linear function over transaction values so we can do ratios sums averages variants Q's outliers by outliers I mean you know give me all the transactions that have amounts outside of this range so you want all high value transactions you can provably correctly give that approximation so orders of magnitude of my trades without revealing for the precise number changes over time and also well-known financial risk measurements like the her fin doll Hirshman index so in this talk oh sorry yeah there are small amounts of well defined leakage for some of these measurements and we're sort of so for example in order to compute the average we actually compute the sum and the number of transactions so we leak a little bit more information than just the average but but it's very well-defined and in this talk I'm only really going to talk about sums but but our paper which I'll sort of put a pointer to on the website explains how to do more complex things and and everything sort of built on the sums primitive so let's sort of talk about the security goals so what are we trying to achieve with the system so first of all we're trying to achieve privacy we want to make sure that the auditor and non-involved parties cannot see transaction participants or amounts so I don't know if I made this clear before but it's not just the case that this third party auditor can't see who's in transactions or who's you know what they're transacting anyone who's not actually involved in the transact and receiving or sending assets also can't see inside of a transaction completeness so what do we mean by completeness what we mean by that is that banks can't lie to the auditor you know they can't say that they have five million euros when actually they have three million euros but also the bank can't selectively leave out things okay so you could imagine some ways of constructing this in which if a bank was really clever they could just sort of not include some of the transactions that they had received and thus change their answers to the auditor but one of our security goals that we achieve is that it's impossible for a bank to omit transactions when responding to the auditor we also are looking to achieve integrity so banks can't violate financial invariants we assume that banks are malicious but honest banks should always be able to convince the auditor of a correct answer so we don't want to have a situation where a malicious bank screws up the ledger such that an honest bank can't respond to the auditor because the ledger doesn't make sense so we also want to maintain that and then progress we also want to make sure that a malicious bank can't block other banks from transacting that they can't hold up the entire ledger and I think as sort of I described the design it will become clear why these things were issues that we were worried about and it's part of sort of how we design the system to address these problems so the threat model and the the threat model describes sort of what kind of attacks that we are trying to protect against and what kind of attacks we don't protect against so banks might attempt to steal or hide assets manipulate balances or lie to the auditor banks can arbitrarily collude so they can try to work together to do these things and banks or the auditor might try to learn transaction contents so we you know it's not that they're passive they might actually be trying to actively learn what's happening in the ledger what's out of scope for this work is a ledger that emits transactions or is unavailable as I said we're agnostic to the ledger implementation as long as it's available so it accepts transactions it doesn't if the ledger doesn't accept transactions we can't we can't help we can't provide our goals we also don't protect against an adversary who might be watching network traffic so if you know there's an adversary whose adversary was snooping on the network and sees that two banks are communicating a lot it's reasonable to assume that those two banks are involved in all the transactions there are other techniques to deal with these problems and then also what's out of scope is banks leaking their own transactions if a bank week their own transactions we throw up our hands sorry we tried to give you privacy but yeah okay so that's the system model for ZK Ledger's that's the threat model those are our goals that's what we're trying to achieve now let's get into the design a little bit so we're gonna build up the design based on this example so here is an example of a public transaction ledger and we're going to slowly start to mask things in this ledger until we sort of end up with ZK Ledger's design so we have four transaction he four transactions here I'm only using one asset euros you'll notice that the first transaction addresses the question that you asked which is well where do funds come from and here we have sort of a special depositor who is injecting assets into the system this depositor might be the central bank for example a lot of systems are actually designed this way so stellar I think chains blockchain ripple you know these systems are all designed with the idea of a special depositor who's allowed to create their own asset and insert it into the system so here maybe the depositor is the european central bank maybe it's some other Bank who provably has a lot of euros in their account and puts them onto the ledger so there's a depositor who's granting 30 million euros to goldman sachs and then there are three transactions and it's i think it's important to sort of like get a sense of what these transactions are so goldman sachs is transferring ten million to JPMorgan and then JPMorgan in two different transactions is transferring money to Barclays okay so let's sort of move on from here so like I said the depositor injects assets the ledger and that's the way of entering transactions that's the depositor right there now our goals are to achieve auditing and privacy what do we mean by that well we shouldn't look at someone looking at the ledger shouldn't be able to tell these amounts except for the depositor transaction which is always public in z.k laser to the depositor transactions are public but they shouldn't be able to tell from after that that Goldman Sachs gave money to JP Morgan and that JP Morgan gave it to Barclays and what the amounts were okay so in this example we're going to try to provably audit Barclays to figure out how many euros Barclays is holding and again we want to hide the participants amounts and the transaction graph so what is it what do I mean when I say transaction graph I think you guys learned a little bit about a system called confidential transactions a few weeks ago confidential transactions does not hide the transaction graph it hides the amounts but you can tell sort of the source of the funds so here the money that Barclays got so if if we were trying to sort of attain privacy in the system the fact the goldman sachs got euros and goldman sachs is the only company that got euros is public because depositor transactions are public but we shouldn't be able to tell that the euros that Barclays got obviously they must have been sourced through goldman sachs we shouldn't be able to tell they went through anyone or how many people they went through or that it was JP Morgan so if we leaked the transaction graph you might be able to tell that there was an entity that the money went through before it got to Barclays so when we say we want to hide the transaction graph that's what we mean we want to be able to hide sort of the flow of funds how they move through the system okay we hide the amounts by using a cryptographic primitive called Peterson commitments so Peterson commitments as you learned are binding and hiding commitments to a value and the way that they work and I'm terribly sorry but I use slightly different notation than tag so tad uses additive notation and a multiplicative notation and I use exponential notation these mean exactly the same thing I will very quickly sort of like write down word what they are so in taja's notation G and H are generators in some group and if you have some value and some randomness then you would write a Peterson commitment as as that okay now whether it's R times G or V times H isn't important the point is these are two generators of a group in this case don't looked at curve group and in Bitcoin SEC P 256 K 1 it doesn't really matter what those generators are you just pick any two you could call this G 1 and G 2 if you wanted to and so this is the way that that Tad's would write it the way that I write it and I'm sorry about this but it's good that you guys get exposed to both ways because you'll find both ways in the literature is in exponential notation so I would write G to the V H to the R and they're multiplied together instead of added together ok so this is just a purely a notational thing it's just instead of the multiply operator here we're doing exponentiation instead of the plus operator here we're doing multiplication so Peterson commitment the way that you commit to a value V is everyone has chosen these generators the generator choice is system-wide this isn't something you pick at the time you make a transaction the generator choice the system-wide however the are the randomness you do choose fresh each time you pick a new randomness each time and the way that you commit to a value is by producing this value right here which is the generator raised to the value times the second generator raised to the randomness and what this serves to do is it's binding it's very difficult to come up with later a different V and a different R that ends up equaling the same commitment and part of that is based on an assumption called the discrete logarithm the hardness of the discrete logarithm problem and also importantly these things can be homomorphic ly combined so if we have these commitments to these different values and we multiply them together well when you multiply things that are raised to an exponent you end up adding the exponent right so by multiplying these things together we end up with a valid commitment to the sum of the values in the individual commitments that we multiplied together so we can homomorphic ly combine the commitments it ends up summing in the expo and the exponent and we end up with a valid commitment to to the sum of the values another nice property of Peterson commitments is that they are very fast to create combined and verify and perhaps surprisingly we can actually achieve all of the auditing functions that I showed you earlier just using Peterson commitments so before we move on questions on Peterson commitments I know that you guys learned about this in a previous class but they're pretty important they're used in a sort of I mean they're used in confidential transactions as well to prove that the sum of the outputs is less than or equal to the sum of the inputs if you guys recall that so any questions ah great question you're making this commitment at the time that the transaction is created and I'll get into the details of who creates the transaction and and what do they choose and how does that happen and how do we make sure they don't screw it up well so measurements happen on top of the ledger so we can actually do all of our measurements basically just on those commitments we don't have to add anything to the ledger okay so yes like I said we can achieve all the auditing functions with Pearson commitments and I'll go into a little bit of detail okay great so we can use Pearson commitments to hide the values so again by using these commitments you can't looking at this right here if you don't know our you can't guess you if you don't know var you you can't if you don't know or you can't get guess what V is that it could be anything so so that's what we mean when we say that that these commitments are perfectly hiding so we also want to hide the participants in the transaction obviously and we're gonna do that using other techniques just sort of hold that in your head for a moment we're going to jump to that but first quickly let's take a look at how auditing might work given that we're using these Peterson commitments okay so assume that we have hidden who the participants are in this ledger through a technique that I'm not telling you about yet we're hiding the values in the transactions that are not inject not these special transactions that are injecting assets or removing assets but just the transfers we're hiding the values using Peterson commitments okay so let's look at how auditing might work so an auditor will ask the bank in this case it's Barclays so remember Barclays received euros and two different transactions and should tell the auditor three million the auditor asks Barclays how many euros you hold Barclays can convince the auditor that three million is the right answer by doing the following okay Barclays can say hey auditor it's three million okay what I'm going to do is I'm going to actually open up the combined commitments for my transactions so I don't actually have this on a slide here but I will sort of write out what this means okay so we have two transactions here and two commitments and one of them is G to the 1 million h2 though let's just say r1 and then the other is G to the 2 million H to the r2 okay these are the commitments in the ledger Barkley's wants to convince the auditor that it has three million euros it does so but when you multiply these two things together and the auditor can't see what these two things are but this is or actually I think it's calm calm two three okay calm three times calm or then when you multiply these two things together you get g to the 1 million plus 2 million H to the r1 plus r2 okay this is three million if barclays gives the auditor three million and r1 plus r2 the sum then the auditor can confirm that if they raise G to this and H to this that it's equal to the multiplication of the two commitments on the ledger and the auditor can given that these two things are equal then the auditor knows that the bank in fact did commit to two commitments that sum to three million dollars it doesn't know if that was one point five and one point five or you know one and two million nine hundred 9 it doesn't know what those individual values were it just knows that the sum is correct so questions about opening commitments yes yeah those are sort of three and four sorry I'm using different numbers there are one in r2 but the comp three income for our this is literally a elliptic curve point so this is the point is is posted to the ledger yes it would yes and we will talk about how we address that and r1 plus r2 yes and those two pieces of information are enough to convince the auditor that three million does in fact correspond to the commitments on the ledger yes sorry you the the auditor will get [Music] number of transactions in this straw man example that I've done it does look that way yes I'm gonna build up to how the auditor won't know the number of transactions did you have a question you're responsible for tracking your randomness and you're responsible for tracking you know your own individual details you have to keep that in your own databases yes if someone found your randomness is then they could guess they could sort of backwards compute the values of your transactions yes okay the time getting to that so we'll get to that so kind of just sort of in figures showing what we did there the auditor can look at those two commitments compute the product for themselves and confirm that this three million lines up with what they see however this has the problem that it reveals the transactions in which Barclays was involved right Barclays is saying hey multiply these two things together right and not only that Barclays could lie okay Barclays could say oh actually I only have 1 million euros here's the transaction that proves that I have 1 million euros I will just give you R 1 I will just point to this transaction don't worry about the other transactions I'm not involved in them the the the bank Barclays could leave out relevant important transactions and sort of this straw man design okay the and and importantly this would match up right the auditor would say okay you've given me the you know Barclays could just give could leave this part out entirely could just give 1 million and r1 and the auditor would say well yes that does match up to commitment three great good work and they wouldn't even know that there was another commitment out there that they should be concerned about so how do we address this problem and this I think was like the key insight in the design of our system in ZK Ledger a transaction actually has a commitment for every participant in the system so let's talk about that for a second so depositor transactions still basically look the way that they did before they're public but transfer transactions now look quite different we don't actually have the names of the participant in the transaction the names are implied by the values of the commitments in that participants column so now we're kind of looking at a transaction as a row with multiple columns and each there's a column per participant in the system if a bank is not involved in the transaction at all then their commitment is to the value zero if the bank is involved in a transaction and it's doing the spending then its value sorry depositor transactions yes if the bank is involved in the transaction it's doing the spending it commits to a negative value and if it's doing the receiving it commits sorry not it but the creator of the transaction commits to a positive value so the way that you transfer one million dollars from one bank to another is committing to a negative value in the spenders column here at JP Morgan and committing to a positive value in the receiver's column here Barclays and again for non-involved banks entries commit to zero but one thing I want to stress commitments to zero are completely indistinguishable from commitments to any other value and commitments to negative values are indistinguishable from commitments to positive values so someone looking at this just looking at these commitments posted to the ledger has no idea which ones are zeros and which ones are negative and which ones are positive yes great question so yeah one design might have it be the case that every black bank actually has to be involved in every transaction we managed to design ZK ledger so that the spending bank can produce one of these without actually involving any of the other banks and I will describe how that happens okay and so given now that we've said essentially every bank is involved in every transaction this actually makes life easier for the auditor because now when the auditor audits a bank they audit every transaction so they audit the entire column for a bank so here when the auditor is auditing Barclays they won't just look at whatever transactions Barclays tells them to look at they will look at every transaction okay and this is where I think someone asked about timing or you know how do you audit sort of like a range of transactions the auditor can actually tell the bank oh I just want to audit from this point onwards or you know it can say I just want to audit your transactions for the past month right if it wants to that doesn't mean that Barclays can leave out transactions because the time there's also a time there's supposed to be a timestamp column in here as well so you can see when transactions happen that's public so what happens here is the auditor is going to look at everything they're actually going to multiply three commitments together because there's three commitments in that column and it doesn't know which ones are zero and which ones are not zero but the zero doesn't contribute to the sum and so a malicious bank can't actually lied the auditor because in this case the bank is not going to leave out commitment for and in fact it's also going to include commitment to so the audit the the the bank is going to have to sort of include are the are primes in here to get all of this to work out correctly it's gonna have to include everything so this next I'm actually going to skip a little bit of this because I think we're running a bit low on time and I do want to get into what the proofs look like so I'm going to skip this averages part I'm going to go back to our security goals what I just described to you the sort of ledger table construction allows us to achieve our first two security goals privacy because these commitments are perfectly hiding no one can see transaction participants they can't tell who's committing to zero and who's not committing to zero banks can't lie to the auditor because the auditor audits every transaction for a bank but what I haven't sort of figured out yet is well how do we maintain financial invariants in this scheme how do we make sure assets are created out of nowhere how do we make sure that um that whoever spending actually consented to the spend and how do we construct this transaction that involves every bank when we assume banks are malicious and we don't want one bank to be able to hold up everyone from transacting so our solution to do that is to use a set of what are called music's non interactive zero knowledge proof so what our music music SAR short binary strings true statements have proofs so if a statement is true it should have a proof indicating that it's true false statements only have proofs with negligible probability and proofs don't reveal why they are true so we're going to use musics to prove things like the person who knows the secret key for this public key consented to this transfer without indicating which bank it was that consented to the transfer we're going to use in is extinct this Bank has the assets to spend without revealing how many assets that Bank actually has so ZK ledger uses a set of carefully crafted mythix and to be clear NICs are pretty old there you know invented in like the 80s and the 90s and most of them are based on the hardness of the discrete log problem in elliptic curve cryptography again so here are sort of the properties we want to achieve with transactions we want transaction validity we want to make sure whoever is the spender who have whoever has a negative amount in their column actually consented to the transfer and again remember in crypto currencies and things like that where it's okay to reveal who the person is who's spending we use a signature in this case we'll use proof of knowledge of secret key so there's a consent music we want to we want to be able to prove that whoever spending actually has the assets to transfer that they're not going negative and we use an assets music and we want to make sure that assets are neither created or destroyed and created nor destroyed so for that we use a balance mystic we want to show that you know things work out and you didn't create more euros than you were supposed to the only place where you can do that is in the depositor transactions the public deposit or transactions we also want to make sure that honest banks can make progress and in order to do that creating a transaction in ZK ledger is non-interactive the banks don't have to interact to create a ledger to create a transaction this is exactly how systems like aetherium and Bitcoin work if you think about it I can't stop you from sending me Bitcoin right if you want to send me Bitcoin you construct a transaction that sends me Bitcoin you submit it to the blockchain I have nothing to do with that offline we might have agreed that the reason you're sending me but because I'm selling you my car or something like that but to actually create the transaction that spends you don't have to talk to the other participants and we use a a consistency music so that the person creating the transaction can prove in a publicly verifiable way that everything matches up correctly that all of these proofs are correct and that all of the banks are going to be able to later respond to the auditor be able to respond to the auditor as necessary so to give you a little bit of insight into why this consistency music is necessary over here I was speaking as though the bank knew all of the are values right actually in ZK ledger if a bank is not involved in a transaction if their commitments to zero they don't know their own R value because some other bank has created that so if we kind of go backwards a little bit to hear Barclays wasn't involved in this transaction at all and in fact Barclays so Goldman Sachs created this transaction JP Morgan created this one and this one Barclays had nothing to do with them they weren't involved at all and we can't assume that Barclays knows what the ours are in its column yet the way that I told you that Barclays responds to the auditor is by revealing the sum of the ours so that's actually not how Zeek a ledger works they don't actually reveal to some of the ARS the proof is a little bit as a little bit more complicated they were they these some of the ARS are encoded for Barclays in a way that it can't see what they are but it can use them and the consistency proof is to prove that that encoding was done correctly so again you can see the paper for details but I'm actually gonna tell you a little bit about these proofs and what they look like okay so consent the consent proof is proof of knowledge of secret key SK so what this means is that if the value in the commitment in this entry is negative then there should be a proof of knowledge of secret key to indicate that this Bank has consented to subtracting from their assets also if the value is negative you need to create a proof of assets technically you don't need a proof of assets if the value is not negative but we're trying not to reveal remember which bank is involved in which transaction so I'm going to draw a transaction up here and kind of explain a little bit about where these proofs actually go so a transaction is a row and there is an entry for each bank in the transaction so there is a commitment so let's let some negative 1 million and then there is also these sets of proofs so proof of assets in particular and every single entry has a proof of assets even though really this is the only bank that's spending and needs to prove that it has assets the proof of assets is constructed to look like this it's a new commitment either to the some of the values in the column or if you're not the spending bank a some a commitment to the value and it also includes a proof that the value in this sort of auxilary commitment is in range so what do i mean by that when we're and i think you guys should have covered this in the Petersons commitment lecture with confidential transactions but when you're dealing with cyclic groups you can wrap around the end of the group and that's sort of the same as if you hadn't wrapped around there are many different sort of things that can occupy one point in the space of a cyclic group so we need to include proofs that you're not that your value is in a range that's expected and in the case of Zeke a ledger we need to prove that the values between zero and a certain upper limit that is less than the order of the group and we use a technique we use the exact same technique that they use in confidential transactions to do this Borromean rings signatures what's kind of exciting is that a few months ago there was some work done on a faster smaller more aggregate able version of range proofs using a system called bulletproof s– so as you'll see in the in the evaluation this sounds like such a silly little thing oh yeah also prove that the values in range this is like ninety percent of all of the space and work in z:k ledger these range proofs are like the bane of our existence they take up a lot of space they're very slow to create and verify but they're necessary so making them faster is a really amazing thing then the next thing is balance and I think this one is illustrative because it shows that a proof sometimes you have a proof not by creating an additional binary string but by the way you choose things in your transaction so the way that you prove that no funds are created or destroyed is as follows so these are remember these commitments are G to the zero let's say H star one G to the negative 1 million H to the R 2 and G to the 1 million H to the R 3 now given that we have these commitments how can we prove to someone that no assets were created or destroyed well what does it mean for no assets to be created or destroyed it means that the sum of the V's is equal to zero right that's what it means for no assets to be created or destroyed how can we prove to someone that given that all they're seeing are these commitments that the sum of the V's commit to zero well let's just make the sum of the ARS also commit to zero so if these some of the ARS also commits to zero then when you multiply these things together you're gonna get G to the sum of the v's H to the sum of the ARS and if both of these things are zero you get G to 0 H to the zero which should equal what one exactly so we construct we choose our R's very carefully so that the auditor or whoever's verifying this can just multiply all the commitments together and verify that they equal one and if it equals one that means it would have been very difficult to choose v's such that you could make this come out to one without it actually coming out to one so that's the proof of balance it's not an actual proof it's not a short binary string it's literally choosing the exponents in a clever way to show that we can maintain the invariant that we want does that make sense questions about these proofs and I didn't go into proof of assets very much it's the most complicated one but you can check out the paper see some confused faces you don't have to ask about the proofs in a certain way and then choose well we're trying to prove that the payment sum to zero that's what we want because that means that assets weren't created or destroyed and remember then everyone okay so this is an important so basically what you're saying I think what you're saying okay you could be saying one of two things number one what if you choose your reason are such that it doesn't sum to zero right what if you produce an invalid transaction and the answer is that everyone will reject it sort of similar to Bitcoin if you don't have a valid transaction good for you but you know it's gonna be considered invalid by everyone in the system everyone's gonna skip it and ignore it it's not going to be a part of what happens right so this is part of the proof of validity that all of the participants have consented to the second question you might be asking is what if the sum what if these don't sum to zero but I still managed to figure out how to make how to make this equal one right that's what you okay so that's the question you're asking based on the assumption that the discrete logarithm problem is hard this is this is like this is too hard to figure out you literally cannot find you there's an assumption here that these are two generators of the group right so that means the G is equal to H to the X that there's some way to produce one generator from the other generator there's an assumption here that figuring out what that X is is nearly impossible so you don't know what G is in terms of H and given that you don't know this relationship between G and H you can't figure out you can't figure out how to produce two exponents that sum to what you want them to some do so this is the this is the hardness problem that these that these systems are based on if you could do that if you could figure out a R's such that it gave you you know a 1 with the V Z wanted then you would have broken the discrete log problem yes great question so the answer is that in our system the way that we choose G and H is by taking the sha-256 hash of 0 and 1 and you can that those are those are numbers such that the you know they're sufficiently sort of random and they show that that you know if every one you know you just you took a hash of 0 and 1 and how'd it like what it you know have no relationship between those two and everyone uses the same G and H right from the beginning any other questions okay so we're kind of running low on time but I do want to go through the evaluation really quickly because um so this system this paper was presented at n s di in April which is a systems conference and I think part of the reason that the systems community liked it was because it's surprisingly fast you know you might have heard that computing on encrypted data is too slow it's infeasible you can't do it well the answer is actually if you understand the set of computations you want to do really well you can structure things using things like these musics and the discrete log problem in such a way that you can do that set of computation actually pretty fast so so ZK ledger is written in go it uses an elliptic curve as the group P 256 K 1 which is the same elliptic curve that Bitcoin uses we use range proofs to prevent overflow like I said we used to say as in confidential transactions and confidential assets and it's around 4,000 lines of code it's mostly cryptography functions we don't actually implement our own ledger because we're ledger agnostic so in the evaluation we want to understand how fast is auditing first of all right and then second you know the construction I described to you you might think is very slow because every bank has an entry in every transaction so how does this actually work in practice can we can we get away with such a construction so first let's talk about the auditing a learning is really fast so here we're auditing for banks and we're doing a measurement called the her friend all Hirshman index so we're trying to understand market concentration for a given asset and what's happening here is the auditor is kind of keeping up with the ledger and keeping these things that we call commitment caches so because of the design of our system and the fact that we're using these Peterson commitments which we can actually sort of maintain rolling sums going down of these commitments and use them when needing to answer the auditor or confirm that a bank has the correct answer and so this means that auditing for banks is milliseconds it's a few milliseconds to audit the banks and you know we see here on the x-axis we have the number of transactions in the ledger the reason that auditing a bank is independent of the number of transactions in the ledger is because of these commitment caches we can do it really fast assuming an online auditor who's kind of keeping up with things we um yeah and part of the reason for that is because of our design it's very amenable to caching now if the auditor is not online and hasn't been maintaining commitment caches then auditing is order the number of things in the ledger here we have a ledger with a hundred thousand entries it takes about 3.5 seconds to audit that still not so bad really and it's pretty linear so you could imagine well if you had a million entries it would be ten times that it would be about 30 seconds still you know pretty reasonable actually a question some people ask is well do we have to maintain this ledger forever you know how do you sort of clean up the ledger you can create a snapshot of the state in the ledger and get rid of the past assuming you know the set of auditing queries that you're gonna do moving forward we don't implement this but this should be possible right so if you wanted to do one hundred million rows it would work out to be about an hour so here is a graph which shows how long it takes to create and verify transactions in ZK ledger and this is varying the number of banks and again we see this is linear the more banks the more entries per transaction the more these commitments the more range proofs quite honestly is the problem because there's a range of proof per entry and what we see here is that with ten banks it's about a second or 800 milliseconds to create a transaction which is not unreasonable again this sort of achieves the you know the the multiple per second sort of things we were looking for depending on the number of banks however it is linear in the site and the number of participants in the system so something like this is not really suitable for you know a setting like Bitcoin where there are thousands if not millions of participants in the system this graph sort of gives you an idea of you know how all of that cost breaks down so these are the components in the proofs there and again I think the thing to get from here is that range proofs are the entirety of the problem so this is the number in a transaction for K participants so if you know eat this is each slot there are two different commitments the first commitment and then the auxilary commitment there's two different consistency proofs there's a disjunctive proof and there's a range proof and yeah the the commitment is one elliptic curve point which Ingo is a big int to begins and so that's 64 bytes this is something that is actually highly compressible we we didn't actually okay yeah I'll get to that point but you note here that the range proofs are slower and larger than everything else combined they are the entirety of the cost of the system so a faster way of doing range proofs is would really help a lot it would would make us be able to handle many many more banks so this is the cost in a transaction per participant in the system one thing that I want to point out is that we this is all very highly parallelizable and like I said we're using goes bigint construction which is not really very optimized at all so I think that there are a lot of opportunities to make this faster so I think we could we could if we sort of put in some engineering legwork we could we could create a system that was maybe twice as fast as what we have now and could handle maybe a hundred banks so really quickly to wrap up related work if you're interested in this topic there's a lot of different papers to read about and sort of study this is all confidential assets kind of sort of kicked off this line of using Peterson commitments for privacy in cryptocurrencies zero cash was the origin of using ZK snarks they don't really support auditing in any fashion then there's there's a couple papers that are really interesting one from Andrew Lowe actually who's in the economics department here at MIT on using secure multi-party computation to attack this problem the the issue with that sort of design is that it doesn't actually have the completeness guarantee that we have so if the ledger really represents the assets in the system and the assets that you're trying to audit then there's no way for a participant to lie in a multi-party computation they can use sort of the wrong inputs so those systems provide secrecy but they don't provide completeness and then I mentioned before solidus before which I think our techniques would apply to really nicely and then a there's a paper a design for doing some other sorts of accountability stuff with crypto currencies like proving you paid your taxes or proving that that money didn't go through some blacklisted thing using ZK snarks but that that's not really a system it's a design but I think that's also a really promising area to consider so future work we want to consider more applications besides just you know large investment banks trading securities and assets can we do something when say clinical trials could we do something with supply chains I think that these techniques can be applied to a lot more areas and I'd love to hear from you guys if you have ideas on that or if you're interested in working on that and want to do a UROP or maybe even an image that's definitely something that's on the table we get surprisingly far with Peterson commitments but if we included more interesting types of cryptographic primitives we could probably get more functionality so what's the type we know given that we might want to apply this to different use cases how what types of functionality might we need there's someone at the Media Lab here who is interested in running algorithms on research trials and secrecy and privacy is really important how can you know I haven't even looked at his sort of machine learning algorithms yet but can we run machine learning algorithms on Peterson commitments or do we need something something else also another really important area of future work is to optimize the implementation like I said I think we could get at least twice as fast if not more and in conclusion yeah if you want to work on any of these things please let me know if you're interested in using cryptographic primitives to achieve privacy with interesting functions let me know the and this is the website for zk ledger which at the moment is really just the paper we have a code base which is too embarrassing to be released out into the real world but we'll get there shortly once we clean it up so that's it so thanks for listening hope you found it interesting [Applause]