Blockchain history: the trilemma
Not the CAP theorem, I mean the other trilemma. No, not the impossibility trinity either, the other one.
Scott Alexander recently noted that there’s a lot of hostility towards cryptocurrencies. I think he did well shining a light on some of the positive economic and regulatory aspects. I figured I’d try to give another slice of the story from a research perspective.
I spent some time in 2017 as a research scientist for a blockchain startup. At the time, the biggest problem with blockchains was transaction time. The popular thing to do was compare a blockchain’s transaction throughput with Visa’s credit card transactions per second. There were numerous solutions proposed, all of which seemed to have severe limitations. Solutions that were scalable and secure inevitably leaned away from decentralization. Solutions that were scalable and decentralized inevitably sacrificed security. Solutions that were decentralized and secure had terrible scaling properties.
These observations gave rise to the “Blockchain Trilemma,” which stated that blockchains had to choose to sacrifice either scalability, security, or decentralization. The trilemma was well enough accepted that if you asked someone at a blockchain startup which two they chose to keep, chances were pretty good that they’d have a definitive answer along with a canned justification.
As far as I can tell, we were the first group, in industry or academia, to reject the trilemma. We recognized— I think correctly— that it was the result of an enormous engineering hurdle, not any theoretical limitation. We were quite public about it, and it eventually caught on.
Blockchains
A blockchain is a decentralized canonical history. “Canonical” implies that constructing a blockchain requires consensus. “Decentralized” places constraints on what kinds of consensus are acceptable. “History” implies some minimum outcome, namely an ordering of events.
Of all the things one can use a powerful primitive to construct, why history?
The main reason is that a canonical history enables constructing a canonical execution trace. In computer science, any software can be represented as a machine that observes the state of the world, optionally updates the state of the world based on what it sees, and repeats forever. As it continues doing this, it leaves a sequence of states in its wake. This sequence of states is called an execution trace. An execution trace enables you to do two things (sketched in code after this list):
Determine the current (latest) state. This can be any software state whatsoever.
Verify that the state transitions— changes from one state to another— were all valid.
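Here's a minimal Python sketch of that picture. It isn't any particular blockchain's data model; the state is just a dictionary, and the two operations above become two small functions.

```python
# Toy execution trace: a machine that repeatedly applies updates to a state,
# leaving the sequence of states behind. All names here are invented for
# illustration only.

def apply_update(state, update):
    """Produce the next state by applying an update to the current one."""
    next_state = dict(state)
    next_state.update(update)
    return next_state

def build_trace(initial_state, updates):
    """Run the machine, recording every state it passes through."""
    trace = [initial_state]
    for update in updates:
        trace.append(apply_update(trace[-1], update))
    return trace

def current_state(trace):
    """(1) Determine the current (latest) state."""
    return trace[-1]

def transitions_valid(trace, is_valid_transition):
    """(2) Verify that every state transition in the trace was valid."""
    return all(is_valid_transition(a, b) for a, b in zip(trace, trace[1:]))
```

For a toy ledger, the state could map users to balances, and is_valid_transition could check that no balance ever goes negative.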
If you’ve heard the term “ledger” associated with Bitcoin, it refers to the current canonical state of Bitcoin. That state keeps track of how much currency is owned by each user. The state for other blockchains, like Ethereum, keeps track of much more data than just cryptocurrency holdings.
The notion of a decentralized canonical history appears in at least one other interesting context. Your brain consists of a hundred billion neurons acting semi-autonomously. Somehow, in a decentralized way, they seem to arrive at a canonical history, namely your memory of the past. If that’s true, then maybe there’s some level of abstraction at which blockchains and the brain do similar things to arrive at a canonical history.
With normal software, your computer typically executes one tiny instruction at a time, and each state transition is small. With a blockchain, anyone that wants to make a state change submits a proposal (transaction), and “the network” collects valid transactions into a single giant state transition. Depending on the blockchain, this can happen every few minutes or multiple times per second.
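Continuing the toy model above, each round the network's job amounts to something like the following sketch: gather proposals, keep the valid ones, and fold them into one big state transition.

```python
def bundle_proposals(state, proposals, is_valid_transaction):
    """Collect valid proposals (transactions) into a single giant state
    transition, returning the new state and the accepted proposals."""
    accepted = []
    for tx in proposals:
        if is_valid_transaction(state, tx):
            state = apply_update(state, tx)
            accepted.append(tx)
    return state, accepted
```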
Given a state transition, the network needs to somehow decide whether to accept that or one of the many alternatives that a chaotic decentralized network might produce. This is done through a consensus algorithm. There are many options, each providing different security and efficiency characteristics. They fall roughly into the classes proof-of-work, proof-of-stake, phase-change, and not-a-good-idea.
Each consensus algorithm comes with a notion of quorum, or agreement on a state transition. Quorum intersection gives the general requirement for canonicalizing a history. Abstractly, quorum intersection states that there need to be guaranteed-valid agreements on an open cover (my terminology) of the history if that history is to be considered canonical. In the case of execution traces, that means two participants can only glue together their validated execution trace pieces along state transitions. This is easiest to show by example.
Let’s say you have honest— and non-buggy— participants Alice, Bob, and Charlie.
Respectively, they have each agreed on the execution trace pieces {A→B→C→D} (Alice), {C→D→E} (Bob), and {E→F} (Charlie).
Alice and Bob have both agreed on the state transition {C→D}, so they can glue together their adjacent trace pieces to get a canonical execution trace piece {A→B→C→D→E}.
Charlie has agreed on state E with Bob, but he has not agreed on any state transition with Bob. So Charlie can't glue his execution trace piece together with Bob's to get another canonical execution trace piece.
The point of the quorum intersection requirement is to make sure there’s enough data to glue together pieces of the history without allowing inconsistencies. Without quorum intersection, bad actors can inject data in a way that tricks honest participants into seeing different “canonical” execution traces, which should never happen.
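As a rough illustration of the gluing rule (a toy model, not a real protocol): two validated trace pieces may only be merged if they share a full state transition, meaning two consecutive states rather than a single common state.

```python
def glue(piece_a, piece_b):
    """Glue two validated trace pieces if they overlap on a full transition.

    A trace piece is a list of states, e.g. ["A", "B", "C", "D"]. Returns the
    merged piece, or None if the pieces only touch at a single state.
    (Simplified: assumes piece_b continues past the end of piece_a.)
    """
    for i in range(len(piece_a) - 1):
        transition = piece_a[i:i + 2]          # a pair of adjacent states
        for j in range(len(piece_b) - 1):
            if piece_b[j:j + 2] == transition:
                # Shared transition found: extend piece_a with whatever
                # piece_b adds after the shared transition.
                return piece_a[:i + 2] + piece_b[j + 2:]
    return None

# Alice and Bob share the transition C->D, so their pieces glue together.
assert glue(list("ABCD"), list("CDE")) == list("ABCDE")
# Charlie only shares the single state E with Bob, so no gluing is allowed.
assert glue(list("CDE"), list("EF")) is None
```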
In topology (math), a cover of X is a bunch of sets that collectively includes everything in X. An open cover does the same thing using open sets. An open set is effectively one where, no matter where you stand within the set, every "nearby" point is also in the set. It's analogous to an open interval, like (1.3, 10), which contains every number between 1.3 and 10, excluding 1.3 and 10 themselves. No matter which point you pick in that interval, it's always surrounded by other points in the interval. To satisfy quorum intersection, you effectively treat execution trace pieces as open intervals, and you can only use the union of intervals if they overlap.
Pooling together update proposals, gathering consensus, and gluing together agreed-upon data: I suspect an analogous version of these problems must be solved to create anything that looks like a decentralized canonical history.
Cryptocurrencies
Blockchains and cryptocurrencies are inseparable.
You can’t have a functioning cryptocurrency without a blockchain because you can’t keep track of a decentralized virtual currency without the equivalent of a decentralized ledger. Virtual currencies can be cloned like any other digital data, and so they are susceptible to “rollback,” where someone can clone the virtual currency and pretend a transaction never took place. The only way to prevent rollback is to keep track of some canonical state of who-owns-what. That is precisely what a ledger provides.
You also can’t have a functioning blockchain without a cryptocurrency. Like other digital data, histories can be cloned, and people can always choose to ignore the canonical history in favor of an older copy. Blockchains are kept secure by constructing incentives that encourage people to use the canonical history, as opposed to some other outdated or invalid history. These incentives need to be fungible across participants to work in a decentralized setting. Any instantiation of a fungible incentive is going to behave like a currency. Any decentralized currency is going to behave like a cryptocurrency.
Blockchains and cryptocurrencies are inseparable. This is very strange.
Smart contracts
People usually think of smart contracts as “programs that run on a blockchain.” A smart contract is better described as a set of constraints on how the canonical state can be updated. This is easier to explain by example.
Alice proposes a transaction to add "Alice calls the requestLoan smart contract with parameter 1 ether" to the canonical history.
The network picks up the proposal. It executes the requestLoan smart contract, passing Alice and 1 ether as the parameters.
The requestLoan smart contract generates two additional state changes: one for adjusting Alice's loan balance sheet, and one for updating the ether ledger to transfer 1 ether from the smart contract to Alice.
Both of these additional state changes are automatically bundled with Alice's transaction. The network will accept either all of these state changes or none of these state changes.
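Here's a rough Python sketch of that first flow. The requestLoan name comes from the example above; the state layout and the contract's exact behavior are invented for illustration.

```python
# Toy model of the first flow: the network executes the contract itself and
# bundles the resulting state changes atomically with Alice's transaction.

def request_loan(state, caller, amount):
    """Return the extra state changes implied by calling requestLoan."""
    return [
        ("loan_balance", caller, +amount),    # adjust the caller's loan balance sheet
        ("ether", caller, +amount),           # transfer ether to the caller...
        ("ether", "loan_contract", -amount),  # ...from the contract's holdings
    ]

def process_proposal(state, caller, amount):
    """Execute the contract and accept all of its changes, or none of them."""
    new_state = {k: dict(v) for k, v in state.items()}  # work on a copy
    for ledger, account, delta in request_loan(new_state, caller, amount):
        balance = new_state[ledger].get(account, 0) + delta
        if ledger == "ether" and balance < 0:
            return state                      # reject: keep the old state
        new_state[ledger][account] = balance
    return new_state                          # accept: all changes at once

state = {"loan_balance": {}, "ether": {"loan_contract": 10}}
state = process_proposal(state, "Alice", 1)
# state == {"loan_balance": {"Alice": 1},
#           "ether": {"loan_contract": 9, "Alice": 1}}
```

The key property is the atomicity at the end: the network never applies some of the contract's state changes without the others.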
Here’s a variant.
Alice wants to add "Alice calls the requestLoan smart contract with parameter 1 ether" to the canonical history.
Alice executes the requestLoan smart contract on her own computer with the appropriate parameters. She calculates the resulting state changes.
Alice bundles all of the state changes into a single transaction for the blockchain.
The network checks that the transaction is valid, meaning that its proposed state changes really do correspond to a correct invocation of the requestLoan smart contract. The proposal only gets bundled into a state transition if it is valid.
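And a sketch of the second flow, still in the same toy model. Here "validation" is literal re-execution of the contract; the point of the zero-knowledge version described below is that the network checks a small proof instead of re-running anything.

```python
def alice_builds_transaction(state, amount):
    """Alice runs requestLoan on her own computer and bundles the changes."""
    changes = request_loan(state, "Alice", amount)
    return {"caller": "Alice", "amount": amount, "changes": changes}

def network_validates(state, transaction):
    """Check that the proposed changes really correspond to a correct
    invocation of requestLoan. A real deployment would verify a succinct
    proof here rather than re-executing the contract."""
    expected = request_loan(state, transaction["caller"], transaction["amount"])
    return transaction["changes"] == expected
```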
The latter approach is made possible by some very impressive breakthroughs in cryptography (zero-knowledge proofs). Many breakthroughs in zero-knowledge proofs were driven by blockchain research.
As far as the canonical history is concerned, both approaches are equivalent. The good blockchains supporting the first kind of smart contracts tend to also support various sorts of zero-knowledge proofs within their smart contract system. This enables them to automatically support the second kind of smart contracts.
The important thing about smart contracts is that they enable the blockchain’s canonical state to be structured in complex ways. That complexity is only limited by the smart contract system of a blockchain. I’m fairly confident that most of the good ones will end up supporting arbitrarily complex smart contracts. How is that possible?
Suppose it takes Alice a minimum of 10 units of computation to run a smart contract with the parameter “1”, 100 units of computation to run the contract with the parameter “2”, 1000 units to run it with “3”, and so on. In the process, Alice can generate some proof data to demonstrate that she performed the computation correctly. How much time do you think the network requires to verify the proof data?
It’s constant time, constant memory regardless of what the smart contract is. Even if it takes Alice an infinite amount of time and memory to run a smart contract (supposing she somehow manages), the network can validate the proof data just as quickly as if it took 1 unit of computation and 1 unit of space.
Once again, this is very strange. I don’t think the full version of this has been implemented in any blockchain yet.
Transactions
Usually transaction “speed” refers to some combination of transaction latency and throughput. Latency is about the time it takes to finalize a transaction (i.e., know with certainty that it will be included in the canonical history). Throughput is the number of transactions that can be added to the canonical history per second. There are various scenarios where these two measures become decoupled. For example, if you accept multiple transactions in parallel, latency doesn’t change, but throughput increases.
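A quick back-of-the-envelope illustration with made-up numbers: batching more transactions per quorum raises throughput, while the time any single transaction waits to finalize stays the same.

```python
block_interval_s = 10   # assumed: one quorum (block) every 10 seconds
for txs_per_block in (1, 5_000):
    throughput = txs_per_block / block_interval_s
    print(f"{txs_per_block} tx/block -> {throughput} tx/s, latency ~{block_interval_s} s")
# 1 tx/block -> 0.1 tx/s, latency ~10 s
# 5000 tx/block -> 500.0 tx/s, latency ~10 s
```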
In deep learning, if you want to make a network "smarter," the easiest approaches involve cramming more data into the neural network weights and using more weights. In blockchain, if you want to make a network "faster," the easiest approaches involve cramming more transactions into quorums (quora?) and generating more quorums.
You can cram an arbitrarily large number of transactions into each quorum thanks to ZK (zero-knowledge) rollups. The idea here is similar to what can be done with smart contract verification. Anyone can generate a proof that some collection of transactions is valid, and the network only needs to validate the resulting (small) proof to automatically verify all of the associated transactions.
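Schematically, the cost structure looks something like this (invented cost units, not a benchmark): the prover pays per transaction, but the network verifies a single small proof per batch no matter how large the batch is.

```python
def prover_cost(num_txs, cost_per_tx=50):
    """Whoever builds the rollup proof pays roughly per transaction."""
    return num_txs * cost_per_tx

def network_cost(num_txs, proof_check_cost=200):
    """The network checks one small proof; num_txs is intentionally unused."""
    return proof_check_cost

for n in (10, 10_000):
    print(n, prover_cost(n), network_cost(n))
# 10      500      200
# 10000   500000   200
```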
You can generate quorums faster by using a consensus protocol with deterministic processing times, unlike proof-of-work, which requires waiting around for some hash-based lottery. The problem with deterministic consensus algorithms is that once someone compromises the protocol, they can use their position to influence the next round of consensus, hijacking the protocol indefinitely. This problem has likely been solved since 2017, but if not, it can be effectively solved by using something like proof-of-work to periodically generate many new parameters for many (but not infinite) deterministic consensus quorums. You would still suffer the cost of proof-of-work occasionally, but only occasionally.
You can generate multiple quorums in parallel by splitting the network into groups (shards) and having each shard individually run a consensus protocol. There’s not really a need to have many thousands of people coming together to form consensus on each individual state transition. Given a cryptographically random split, you generally only need several hundred participants per shard if you want to secure each shard about as well as the main network itself. This gets complicated when the state relevant to a transaction is split across multiple shards, but not too much more complicated assuming you’re willing to adopt new data ownership models for smart contracts.
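Here's a sketch of the random split (my own illustration; real protocols differ in detail): derive a shared seed, for example from a recent block hash, and deterministically shuffle participants into shards so that every honest node computes the same assignment without a coordinator.

```python
import hashlib
import random

def assign_shards(participants, shared_seed, shard_size=400):
    """Deterministically split participants into shards of ~shard_size.

    Every node that knows the same shared_seed (say, a recent block hash)
    computes exactly the same assignment.
    """
    seed = int.from_bytes(hashlib.sha256(shared_seed).digest(), "big")
    shuffled = sorted(participants)           # canonical order first
    random.Random(seed).shuffle(shuffled)     # then a seed-determined shuffle
    return [shuffled[i:i + shard_size] for i in range(0, len(shuffled), shard_size)]

shards = assign_shards([f"node{i}" for i in range(2000)], b"recent-block-hash")
# 5 shards of 400 participants, each running its own consensus protocol
```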
You can speed up transaction times without speeding up quorum by writing a commitment to the canonical history. As an example, a commitment can be something like "if Alice and Bob both cryptographically sign X, then Alice agrees that it's okay to call the smart contract S with the parameters {Alice, X}". With that, Bob can get guarantees about what will get accepted in a future quorum without having to go through a separate quorum for each state change. Once he and Alice know what the transaction should be, they can effectively get the transaction committed long before they actually get it committed to the canonical history.
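A very rough sketch of such a commitment, using HMACs with shared secrets as a stand-in for real digital signatures (everything here is invented for illustration):

```python
import hashlib
import hmac

def sign(secret_key, message):
    """Stand-in for a digital signature; real systems use public-key schemes."""
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()

# On-chain commitment (paraphrasing the example above): "if Alice and Bob both
# sign X, Alice agrees the contract S may be called with parameters {Alice, X}".
def commitment_satisfied(x, alice_sig, bob_sig, alice_key, bob_key):
    return (hmac.compare_digest(alice_sig, sign(alice_key, x)) and
            hmac.compare_digest(bob_sig, sign(bob_key, x)))

# Off-chain, Alice and Bob exchange signatures over X; once Bob holds both,
# he already knows the call to S will be accepted by a future quorum.
alice_key, bob_key, x = b"alice-secret", b"bob-secret", b"X: settle channel"
assert commitment_satisfied(x, sign(alice_key, x), sign(bob_key, x),
                            alice_key, bob_key)
```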
None of these approaches inherently needs to sacrifice decentralization or security.
Decentralization
I mentioned decentralization several times, but I never actually defined it. The way we thought about it, decentralization implied two properties:
A minimal barrier to maximally useful participation.
Interchangeability between participants. Equivalently, recovery from any expected faulty participation.
Decentralization seems to be necessary, but not sufficient, for scalability of participation. Scalability of participation is easier to explain through examples.
When you’re creating a programming language, you typically want many relevant software libraries from many other languages to be easily usable in your language.
When you’re developing a machine learning architecture, you typically want it to be as easy as possible to integrate ongoing research findings into the resulting model.
When you’re developing a general-purpose blockchain, you typically want it to be as easy as possible for people to contribute to making the network faster, more robust, and more useful.
Abstractly, participation is the thing that operates on a blockchain’s canonical state. Whereas the purpose of smart contract and transaction processing advances is to scale the canonical state and canonical history in various ways, the purpose of decentralization is to scale participation.
Closing notes
Blockchain is strange and fascinating. It is weirdly tangled with cryptography, mechanism design, and distributed computing. The viability and robustness of blockchain are a kind of evidence of a deep connection between these things, and each major blockchain advance seems to elucidate that connection just a little more. Whether you're new to blockchain or a long-time veteran, hopefully this post gave you some new perspectives on the field.
Yes, I know about closed blockchains. They’re significantly less useful and significantly less interesting.