Key Takeaways
Ethereum is the only major protocol building a scalable unified settlement and
data availability layer
Rollups scale computation while leveraging Ethereum’s security
All roads lead to the endgame of centralized block production, decentralized
trustless block validation, and censorship resistance
Innovations such as proposer-builder separation and weak statelessness unlock
this separation of powers (building and validating) to achieve scalability without
sacrificing security or decentralization
MEV is now front and center – numerous designs are planned to mitigate its
harm and prevent its centralizing tendencies
Danksharding combines multiple avenues of cutting-edge research to provide
the scalable base layer required for Ethereum’s rollup-centric roadmap
I do expect danksharding to be implemented within our lifetimes
Table of Contents
Introduction
Part I – The Road to Danksharding
1) Original Data Sharding Design – Separate Shard Proposers
2) Data Availability Sampling
3) KZG Commitments
4) KZG Commitments vs. Fraud Proofs
5) In-protocol Proposer-Builder Separation
6) Censorship Resistance List (crList)
7) 2-Dimensional KZG Scheme
8) Danksharding
9) Danksharding – Honest Majority Validation
10) Danksharding – Reconstruction
11) Danksharding – Malicious Majority Safety With Private Random Sampling
12) Danksharding – Key Takeaways
13) Danksharding – Limits to Blockchain Scalability
14) Proto-danksharding (EIP-4844)
15) Multidimensional EIP-1559
Part II – History & State Management
1) Calldata Gas Cost Reduction With Total Calldata Limit (EIP-4488)
2) Bounding Historical Data in Execution Clients (EIP-4444)
3) Recovering Historical Data
4) Weak Statelessness
5) Verkle Tries
6) State Expiry
Part III – It’s all MEV
1) MEV Supply Chain Today
2) MEV-Boost
3) Committee-driven MEV Smoothing
4) Single-slot Finality
5) Single Secret Leader Election
Part IV – The Merge: Under the Hood
1) Clients After the Merge
2) Consensus After the Merge
Concluding Thoughts
Introduction
I’ve been pretty skeptical about the timing of the merge ever since Vitalik said there’s a
50-75% chance people born today live to the year 3000 and he hopes to be
immortal. But what the hell, let’s have some fun and look even further ahead to
Ethereum’s ambitious roadmap anyway.
This is no quick-hitter article. If you want a broad yet nuanced understanding of
Ethereum’s ambitious roadmap – give me an hour of your focus, and I’ll save you
months of work.
Ethereum research is a lot to keep track of, but everything ultimately weaves into
one overarching goal – scale computation without sacrificing decentralized
validation.
Hopefully you’re familiar with Vitalik’s famous “Endgame.” He acknowledges that
some centralization is needed to scale. The “C” word is scary in blockchain, but it’s
true. We just need to keep that power in check with decentralized and trustless
validation. No compromises here.
Specialized actors will build blocks for both the L1 and above. Ethereum remains
incredibly secure through easy decentralized validation, and rollups inherit their
security from the L1. Ethereum then provides settlement and data availability
allowing rollups to scale. All of the research here ultimately looks to optimize these
two roles while simultaneously making it easier than ever to fully validate the chain.
Here’s a glossary to shorten some words that will show up ~43531756765713534
times:
DA – Data Availability
DAS – Data Availability Sampling
PBS – Proposer-builder Separation
PDS – Proto-danksharding
DS – Danksharding
PoW – Proof of Work
PoS – Proof of Stake
Part I – The Road to Danksharding
Hopefully you’ve heard by now that Ethereum has pivoted to a rollup-centric
roadmap. No more execution shards – Ethereum will instead optimize for data-
hungry rollups. This is achieved via data sharding (Ethereum’s plan, kind of) or big
blocks (Celestia’s plan).
The consensus layer does not interpret the shard data. It has one job – ensure the
data was made available.
I’ll assume familiarity with some basic concepts like rollups, fraud and ZK proofs,
and why DA is important. If you’re unfamiliar or just need a refresher, Can’s recent
Celestia report covers them.
Original Data Sharding Design – Separate Shard Proposers
The design described here has been scrapped, but it’s valuable context. I’ll refer to
this as “sharding 1.0” throughout for simplicity.
Each of the 64 shards had its own proposer and committee, rotated through from the validator set. Each committee individually verified that its shard’s data was made available. This wouldn’t have been DAS initially – it relied on an honest majority of each shard’s validator set to fully download the data.
This design introduces unnecessary complexity, worse UX, and attack vectors.
Shuffling validators around between shards is tricky.
It also becomes very difficult to guarantee that voting will be completed within a
single slot unless you introduce very tight synchrony assumptions. The Beacon
block proposer needs to gather all of the individual committee votes, and there can
be delays.
Data Availability Sampling
Quick lesson for everyone else who forgot math class. (I promise this won’t be really scary math – I had to watch some Khan Academy videos to write these sections, but even I get it now.)
Polynomials are expressions summing any finite number of terms of the form c·x^k. The degree is the highest exponent. For example, 2x^3 + 6x^2 + 2x − 4 is a polynomial of degree three. You can reconstruct any polynomial of degree d from any d+1 coordinates that lie on that polynomial.
Now for a concrete example. Below we have four chunks of data (d0 through d3).
These chunks of data can be mapped to evaluations of a polynomial f(X) at a given
point. For example, f(0) = d0. Now you find the polynomial of minimal degree that
runs through these evaluations. Since this is four chunks, we can find the
polynomial of degree three. Then, we can extend this data to add four more
evaluations (e0 through e3) which lie along the same polynomial.
Remember that key polynomial property – we can reconstruct it from any four
points, not just our original four data chunks.
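If you want to see this in action, here’s a minimal Python sketch (my own toy illustration – real implementations work over a large finite field with FFT-based math, not tiny integers): extend four chunks to eight evaluations, then recover the originals from a different four points.

```python
# Toy erasure coding via Lagrange interpolation over a small prime field.
# Real implementations use a large field (e.g., BLS12-381's scalar field),
# but the principle is identical.
P = 2**31 - 1  # a small prime modulus standing in for a real field

def lagrange_eval(points, x):
    """Evaluate the unique minimal-degree polynomial through `points` at x."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

data = [11, 22, 7, 90]                      # d0..d3, our original chunks
original = list(enumerate(data))            # interpret as f(0)..f(3)
# Extend: evaluate the same degree-3 polynomial at x = 4..7 to get e0..e3.
extended = original + [(x, lagrange_eval(original, x)) for x in range(4, 8)]

# Reconstruct the full data from ANY four of the eight points:
some_four = [extended[1], extended[3], extended[4], extended[6]]
recovered = [lagrange_eval(some_four, x) for x in range(4)]
assert recovered == data
```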
Back to our DAS. Now we only need to be sure that any 50% (4/8) of the erasure-
coded data is available. From that, we can reconstruct the entire block.
So an attacker would have to hide >50% of the block to successfully trick DAS
nodes into thinking the data was made available when it wasn’t.
The probability of <50% being available after many successful random samples is very small. If we successfully sampled the erasure-coded data 30 times, then the probability that <50% is available is 2^-30.
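The arithmetic behind that, spelled out (my own quick illustration):

```python
# Each sample of the erasure-coded data hits an available chunk with
# probability < 1/2 if less than half was actually published, so k
# independent successful samples happen with probability < (1/2)^k.
for k in (10, 20, 30):
    print(f"{k} samples -> false-availability probability < {0.5**k:.1e}")
# 30 samples -> false-availability probability < 9.3e-10, i.e. 2^-30
```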
KZG Commitments
Ok, so we did a bunch of random samples and it was all available. But we have
another problem – was the data erasure coded properly? Otherwise maybe the
block producer just added 50% junk when they extended the block, and we
sampled nonsense. In which case we wouldn’t actually be able to reconstruct the
data.
Normally we just commit to large amounts of data by using a Merkle root. This is
effective in proving inclusion of some data within a set.
However, we also need to know that all of the original and extended data lie on the
same low-degree polynomial. A Merkle root does not prove this. So if you use this
scheme, you also need fraud proofs in case it was done incorrectly. That leaves two traditional routes:
1. Fraud proofs – Celestia’s route. Someone needs to check that the encoding was done correctly and broadcast a fraud proof if it wasn’t. This requires an honest minority assumption and a synchrony assumption (i.e., in addition to someone sending me the fraud proof, I also need to assume I’m connected and will receive it within a finite amount of time).
2. Ethereum and Polygon Avail are going a new route – KZG commitments (a.k.a. Kate commitments). This removes the honest minority and synchrony assumptions for safety in regards to fraud proofs (though they’re still there for reconstruction, as we’ll cover shortly).
Other solutions exist, but they’re not being actively pursued. For example, you could
use ZK-proofs. Unfortunately they’re computationally impractical (for now).
However, they’re expected to improve over the next few years, so Ethereum will
likely pivot to STARKs down the road because KZG commitments are not quantum-
resistant.
Back to KZG commitments – these are a type of polynomial commitment scheme.
Commitment schemes are just a cryptographic way to provably commit to some
values. The best metaphor is putting a letter inside a locked box and handing it to
someone else. The letter can’t be changed once inside, but it can be opened with
the key and proven. You commit to the letter, and the key is the proof.
In our case, we map all that original and extended data on an X,Y grid, then find the minimal-degree polynomial that runs through them (this process is called Lagrange interpolation). This polynomial is what the prover will commit to:
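To give a feel for the shape of that commit step, here’s a toy sketch – emphatically not the real scheme (actual KZG commitments live on the BLS12-381 elliptic curve and openings are verified with pairings; a plain multiplicative group mod a prime stands in here purely for intuition):

```python
# Toy "commit to g^f(s) without knowing the secret s" sketch.
import random

q = 2**127 - 1   # Mersenne prime: modulus of our stand-in group
g = 3            # base element of the toy group

# "Trusted setup": a secret s is sampled once, g^(s^i) is published,
# and s is thrown away. Exponents live mod q-1 (Fermat's little theorem).
s = random.randrange(2, q - 1)
setup = [pow(g, pow(s, i, q - 1), q) for i in range(4)]  # degree <= 3

def commit(coeffs):
    """C = prod (g^(s^i))^c_i = g^f(s), computed from the setup alone."""
    C = 1
    for c_i, g_to_s_i in zip(coeffs, setup):
        C = C * pow(g_to_s_i, c_i, q) % q
    return C

f = [4, 0, 2, 1]  # the polynomial we commit to, in coefficient form
C = commit(f)

# Sanity check, only possible here because this toy kept s around:
f_at_s = sum(c_i * pow(s, i, q - 1) for i, c_i in enumerate(f)) % (q - 1)
assert C == pow(g, f_at_s, q)
```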
If the number of validators online starts to get uncomfortably low, DS can be set to
reduce the shard data blob count automatically. Thus the security assumption
would be brought down to a safe level.
Danksharding – Malicious Majority Safety With Private Random
Sampling
We saw that DS validation relies on the honest majority to attest to blocks. I as an
individual can’t prove to myself that a block is available by just downloading a
couple rows and columns. However, private random sampling can give me this
guarantee without trusting anyone. This is where nodes check those 75 random
samples as discussed earlier.
DS initially won’t include private random sampling because it’s simply a very difficult
problem to solve on the networking side (PSA: maybe they could actually use your
help here!).
Note the “private” is important because if an attacker has de-anonymized you, they’re able to trick a small number of sampling nodes. They can just return the exact chunks you requested and withhold the rest. So you wouldn’t know from your own sampling alone that all of the data was made available.
Danksharding – Key Takeaways
In addition to being a sweet name, DS is also incredibly exciting. It finally achieves
Ethereum’s vision of a unified settlement and DA layer. This tight coupling of the
Beacon block and shards essentially pretends not to be sharded.
In fact, let’s define why it’s even considered “sharded” at all. The only remnant of
“sharding” is simply the fact that validators aren’t responsible for downloading all of
the data. That’s it.
So you’re not crazy if you were questioning by now whether this is really still
sharding. This distinction is why PDS (we’ll cover this shortly) is not considered
“sharded” (even though it has “sharding” in the name, yes I know it’s confusing).
PDS requires each validator to fully download all shard blobs in order to attest to
their availability. DS then introduces sampling, so individual validators only
download pieces of it.
The minimal sharding thankfully means a much simpler design than sharding 1.0 (so
quicker delivery right? right?). Simplifications include:
Likely hundreds fewer lines of code in the DS specification vs. the sharding 1.0 specification (and thousands fewer in the clients)
No more shard committee infrastructure, committees just need to vote on the
main chain
No tracking of separate shard blob confirmations, now they all get confirmed in
the main chain or they don’t
One great result of this – a merged fee market for data. Sharding 1.0 with distinct
blocks made by separate proposers would have fragmented this.
The removal of shard committees also strengthens bribery resistance. DS validators
vote once per epoch on the entire block, so data gets confirmed by 1/32 of the
entire validator set immediately (32 slots per epoch). Sharding 1.0 validators also
voted once per epoch, but each shard had its own committee being shuffled
around. So each shard was only being confirmed by 1/2048 of the validator set
(1/32 split across 64 shards).
The combined blocks and the 2D KZG commitment scheme also make DAS far more efficient, as discussed. Sharding 1.0 would require 60 KB/s of bandwidth to check full DA of all shards; DS requires only 2.5 KB/s.
Another exciting possibility kept alive with DS – synchronous calls between ZK-
rollups and L1 Ethereum execution. Transactions from a shard blob can immediately
confirm and write to the L1 because everything is produced in the same Beacon
Chain block. Sharding 1.0 would’ve removed this possibility due to separate shard
confirmations. This allows for an exciting design space which could be incredibly
valuable for things like shared liquidity (e.g., dAMM).
Danksharding – Limits to Blockchain Scalability
Modular base layers scale elegantly – more decentralization begets more scaling.
This is fundamentally different from what we see today. Adding more nodes to a DA
layer allows you to safely increase data throughput (i.e., more room for rollups to
live on top).
There are still limits to blockchain scalability, but we can push orders of magnitude
higher than anything we see today. Secure and scalable base layers allow execution
to proliferate atop them. Improvements in data storage and bandwidth will also
allow for higher data throughput over time.
Pushing beyond the DA throughput contemplated here is certainly in the cards, but
it’s hard to say where that max will end up. There isn’t a clear red line, but rather an
area where some assumptions will start to feel uncomfortable:
Data storage – This ties into DA vs. data retrievability. The consensus layer’s
role isn’t to guarantee data retrievability indefinitely. Its role is to make it
available for enough time such that anyone who cares to download it can,
satisfying our security assumptions. Then it gets dumped into storage wherever
– this is comfortable as history is a 1 of N trust assumption, and we’re not
actually talking about that much data in the grand scheme of things. This could
get into uncomfortable territory though years down the line as throughput
increases by orders of magnitude.
Validators – DAS requires enough nodes to collectively reconstruct the block.
Otherwise, an attacker could wait around and respond only to the queries they receive. If the samples served aren’t enough to reconstruct the block, the attacker could withhold the rest and we’re out of luck. To safely increase
throughput, we need to add more DAS nodes or increase their data bandwidth
requirements. This isn’t a concern for the throughput discussed here. Again
though, this could get uncomfortable if throughput increases by further orders
of magnitude from this design.
Notice the builder isn’t the bottleneck. Builders will need to quickly generate KZG proofs for 32 MB of data, so expect a GPU or a pretty beefy CPU plus at least 2.5 GBit/s of bandwidth. This is a specialized role anyway, and for these operators it’s a negligible cost of doing business.
Proto-danksharding (EIP-4844)
DS is awesome, but we’ll have to be patient. PDS aims to tide us over – it
implements necessary forward-compatible steps toward DS on an expedited
timeline (targeting the Shanghai hard fork) to provide orders of magnitude scaling in
the interim. However, it doesn’t actually implement data sharding yet (i.e., validators
need to individually download all of the data).
Rollups today use L1 “calldata” for storage which persists on-chain forever.
However, rollups only need DA for some reasonable period of time such that anyone
interested has plenty of time to download it.
EIP-4844 introduces the new blob-carrying transaction format that rollups will use for data storage going forward. Blobs carry a large amount of data (~125 KB), and they can be much cheaper than similar amounts of calldata. Data blobs are then pruned from nodes after a month, which blunts storage requirements. This is plenty of time to satisfy our DA security assumptions.
For context on scale, current Ethereum blocks average ~90 KB (calldata is ~10 KB of this). PDS unlocks far more DA bandwidth (target ~1 MB, max ~2 MB) for blobs because they get pruned after a month. They aren’t a permanent drag on nodes.
A blob is a vector of 4096 field elements of 32 bytes each. PDS allows for a max of 16 per block, and DS will bump that up to 256.
PDS DA bandwidth = 4096 x 32 x 16 = 2 MiB per block, targeted at 1 MiB
DS DA bandwidth = 4096 x 32 x 256 = 32 MiB per block, targeted at 16 MiB
Orders of magnitude scaling with each step. PDS still requires consensus nodes to
fully download the data, so it’s more conservative. DS distributes the load of storing
and propagating data between validators.
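Spelling out the arithmetic above (the ~125 KB figure reflects usable bytes – each 32-byte field element carries ~31 usable bytes since values must stay below the field modulus):

```python
FIELD_ELEMENTS_PER_BLOB = 4096
BYTES_PER_FIELD_ELEMENT = 32
blob_bytes = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT  # 131,072 B raw
usable_bytes = FIELD_ELEMENTS_PER_BLOB * 31                     # ~125 KB usable

pds_max = 16 * blob_bytes    # 2,097,152 B  = 2 MiB per block (target ~1 MiB)
ds_max  = 256 * blob_bytes   # 33,554,432 B = 32 MiB per block (target ~16 MiB)
```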
Here are some of the goodies introduced by EIP-4844 on the road to DS:
Data blob-carrying transaction format
KZG commitments to the blobs
All of the execution-layer logic required for DS
All of the execution / consensus cross-verification logic required for DS
Layer separation between BeaconBlock verification and DAS blobs
Most of the BeaconBlock logic required for DS
A self-adjusting independent gas price for blobs (multidimensional EIP-1559
with an exponential pricing rule)
And then DS will further add:
PBS
DAS
2D KZG scheme
Proof-of-custody or similar in-protocol requirement for each validator to verify
availability of a particular part of the sharded data in each block (probably for
about a month)
Note these data blobs are introduced as a new transaction type on the execution
chain, but they don’t burden the execution side with additional requirements. The
EVM only views the commitment attached to the blobs. The execution layer
changes being made with EIP-4844 are also forward compatible with DS, and no
more alterations will be needed on this side. The upgrade from PDS to DS then only
requires consensus layer changes.
Data blobs are fully downloaded by consensus clients in PDS. The blobs are referenced, but not fully encoded, in the Beacon block body. Instead of embedding the full contents in the body, the blobs’ contents are propagated separately as a “sidecar.” There is one blob sidecar per block – it’s fully downloaded in PDS, and with DS validators will conduct DAS on it instead.
We discussed earlier how to commit to the blobs using KZG polynomial
commitments. However, instead of using the KZG directly, EIP-4844 implements
what we actually use – its versioned hash. This is a single 0x01 byte (representing
the version) followed by the last 31 bytes of the SHA256 hash of the KZG.
We do this for easier EVM compatibility & forward compatibility:
EVM compatibility – KZG commitments are 48 bytes, whereas the EVM works more naturally with 32-byte values
Forward compatibility – if we ever switch from KZG to something else (STARKs
for quantum-resistance), the commitments can continue to be 32 bytes
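A minimal sketch of that construction (the 48-byte commitment here is a dummy placeholder):

```python
# Minimal sketch of the versioned-hash construction described above.
import hashlib

VERSION_KZG = b"\x01"

def kzg_to_versioned_hash(kzg_commitment: bytes) -> bytes:
    """0x01 version byte + last 31 bytes of SHA256(commitment) = 32 bytes."""
    assert len(kzg_commitment) == 48  # KZG commitments are 48 bytes
    return VERSION_KZG + hashlib.sha256(kzg_commitment).digest()[1:]

vh = kzg_to_versioned_hash(bytes(48))   # dummy all-zero commitment
assert len(vh) == 32 and vh[0] == 0x01  # EVM-friendly 32-byte value
```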
Multidimensional EIP-1559
PDS finally creates a tailored data layer – data blobs will get their own distinct fee
market with separate floating gas prices and limits. So even if some NFT project is
selling a bunch of monkey land on L1, your rollup data costs won’t go up (though
proof settlement costs would). This acknowledges that the dominant cost for any
rollup today is posting their data to the L1 (not proofs).
The gas fee market is unchanged, and data blobs are added as a new market:
The blob fee is charged in gas, but it’s a variable amount adjusting based on its own
EIP-1559 mechanism. The long run average number of blobs per block should equal
the target.
You effectively have two auctions running in parallel – one for computation and one
for DA. This is a giant leap in efficient resource pricing.
There are some interesting designs being thrown around here. For example, it might
make sense to change both the current gas and blob pricing mechanisms from the
linear EIP-1559 to a new exponential EIP-1559 mechanism. The current
implementation doesn’t average out to our target block sizes in practice. The base
fee stabilizes imperfectly today, resulting in the observed average gas used per
block exceeding the target by ~3% on average.
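For a feel of what an exponential rule looks like, here’s a sketch following the shape of the EIP-4844 draft’s integer-only approximation of MIN_FEE · e^(excess_blob_gas / UPDATE_FRACTION) – constants are illustrative and subject to change:

```python
def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer Taylor-series approximation of factor * e^(numerator/denominator)."""
    i = 1
    output = 0
    numerator_accum = factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        numerator_accum = numerator_accum * numerator // (denominator * i)
        i += 1
    return output // denominator

MIN_BLOB_FEE = 1             # illustrative floor price
UPDATE_FRACTION = 3_338_477  # illustrative; sets how fast the fee reacts

def blob_base_fee(excess_blob_gas: int) -> int:
    # excess_blob_gas tracks cumulative blob usage above the per-block target,
    # so sustained demand above target drives the fee up exponentially.
    return fake_exponential(MIN_BLOB_FEE, excess_blob_gas, UPDATE_FRACTION)
```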
Part II – History & State Management
Quick recap on some basics here:
History – Everything that’s ever happened on-chain. You can just stick it on a
hard drive as it doesn’t require quick access. 1 of N honesty assumption in the
long term.
State – Snapshot of all the current account balances, smart contracts, etc. Full
nodes (currently) all need this on hand to validate transactions. It’s too big for
RAM, and a hard drive is too slow – it goes in your SSD. High throughput
blockchains balloon their state, growing far beyond what us normies can keep
on our laptops. If everyday users can’t hold the state, they can’t fully validate,
so goodbye decentralization.
TLDR – these things get really big, so if you make nodes hold onto them it gets hard
to run a node. If it’s too hard to run a node, us regular folk won’t do it. That’s bad, so
we need to make sure that doesn’t happen.
Calldata Gas Cost Reduction With Total Calldata Limit (EIP-4488)
PDS is a great stepping stone toward DS which checks off many of the eventual
requirements. Implementing PDS within a reasonable timespan can then pull
forward the timeline on DS.
An easier to implement band-aid would be EIP-4488. It’s not quite so elegant, but it
addresses the fee emergency nonetheless. Unfortunately it doesn’t implement
steps along the way to DS, so all of the inevitable changes will still be required later.
If it starts to feel like PDS is going to be a bit slower than we’d like, it could make
sense to quickly jam through EIP-4488 (it’s just a couple lines of code changes)
then get to PDS say another six months later. Timing is still in motion here.
EIP-4488 has two major components:
Reduce calldata cost from 16 gas per byte to 3 gas per byte
Add a limit of 1 MB calldata per block plus an extra 300 bytes per transaction
(theoretical max of ~1.4 MB total)
The limit needs to be added to prevent the worst-case scenario – at 3 gas per byte, a block full of calldata would reach 10 MB (30 million gas / 3), which is far beyond what Ethereum can handle.
EIP-4488 increases Ethereum’s average data capacity, but its burst data capacity
would actually decrease slightly due to this calldata limit (30 million gas / 16 gas per
calldata byte = 1.875 MB).
EIP-4488’s sustained load is much higher than PDS because this is still calldata vs.
data blobs which can be pruned after a month. History growth would accelerate
meaningfully with EIP-4488, making that a bottleneck to run a node. Even if EIP-
4444 is implemented in tandem with EIP-4488, this only prunes execution payload
history after a year. The lower sustained load of PDS is clearly preferable.
Weak Statelessness
It’s acceptable that builders still require the state under PBS – they’ll be more centralized, high-resource entities anyway. The focus is on decentralizing validators. Weak statelessness gives the builders a bit more work, and validators far less work. Great tradeoff.
You achieve this magical stateless execution with witnesses. These are proofs of
correct state access that builders will start including in every block. Validating a
block doesn’t actually require the whole state – you only need the state being read
or affected by the transactions in that block. Builders will start including the pieces
of state affected by transactions in a given block, and they’ll prove they correctly
accessed that state with witnesses.
Let’s play out an example. Alice wants to send 1 ETH to Bob. To verify a block with
this transaction, I need to know:
Before the transaction – Alice had 1 ETH
Alice’s public key – so I can tell the signature is correct
Alice’s nonce – so I can tell the transaction was sent in the correct order
After executing the transaction – Bob has 1 ETH more, Alice has 1 ETH less
In a weak statelessness world, the builder adds the above witness data to the block
and the proof of its accuracy. The validator receives the block, executes it, and
decides if it’s valid. That’s it!
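Here’s a toy of that flow (all names hypothetical – real witnesses are Verkle proofs against the state root, abstracted below as verify_witness):

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Account:
    balance: int  # in wei
    nonce: int

def validate_stateless(pre_root: bytes,
                       tx: dict,
                       witness: Dict[str, Account],
                       verify_witness: Callable[[bytes, Dict[str, Account]], bool]) -> dict:
    # 1) The witness must actually correspond to the claimed pre-state root.
    assert verify_witness(pre_root, witness), "bad state-access proof"
    sender, receiver = witness[tx["from"]], witness[tx["to"]]
    # 2) Ordering check via the sender's nonce (signature check elided here).
    assert tx["nonce"] == sender.nonce, "wrong nonce"
    # 3) Execute: the sender must cover the transfer.
    assert sender.balance >= tx["value"], "insufficient balance"
    sender.balance -= tx["value"]
    receiver.balance += tx["value"]
    sender.nonce += 1
    # The builder would commit these post-state pieces into a new state root.
    return {tx["from"]: sender, tx["to"]: receiver}
```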
Here are the implications from the validator perspective:
Huge SSD requirement for holding state disappears – this is the key bottleneck
to scaling today.
Bandwidth requirements will increase a bit as you’re now also downloading the witness data and proof. Witnesses would be prohibitively large with Merkle-Patricia trees, but they’d be mild and not the bottleneck with Verkle tries.
You still execute the transaction to fully validate. Statelessness acknowledges
the fact that this isn’t currently the bottleneck to scaling Ethereum.
Weak statelessness also allows Ethereum to loosen self-imposed constraints on its
execution throughput with state bloat no longer a pressing concern. Bumping up
gas limits ~3x could be reasonable.
Most user execution will be taking place on L2s at this point anyway, but higher L1
throughput is still beneficial even to them. Rollups rely on Ethereum for DA (posted
to shards) and settlement (which requires L1 execution). As Ethereum scales its DA
layer, the amortized cost of posting proofs could become a larger share of rollup
costs (especially for ZK-rollups).
Verkle Tries
We glossed over how those witnesses actually work. Ethereum currently uses a
Merkle-Patricia tree for state, but the Merkle proofs required would be far too large
for these witnesses to be feasible.
Ethereum will pivot to Verkle tries for state storage. Verkle proofs are far more
efficient, so they can serve as viable witnesses to enable weak statelessness.
First let’s recap what a Merkle tree looks like. Every transaction is hashed to start – these hashes at the bottom are called “leaves.” Every hash above is a “node” – the hash of the two “children” nodes below it. The resulting final hash is the “Merkle root.”
This is a helpful data structure for proving inclusion of a transaction without needing
to download the whole tree. For example, if you wanted to verify that transaction H4
is included, you just need H12, H3, and H5678 in a Merkle proof. We have H12345678
from the block header. So a light client can ask a full node for those hashes then
hash them together per the route in the tree. If the result is H12345678, then we’ve
successfully proved that H4 is in the tree.
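Here’s that exact verification as a runnable toy (SHA256 over an 8-leaf tree – Ethereum’s actual structures differ, but the proof logic is the same):

```python
# Verify H4's inclusion: hash the leaf together with the supplied
# siblings (H3, then H12, then H5678) all the way up to the root.
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def verify_merkle_proof(leaf, proof, root):
    """proof = [(sibling_hash, side)], side = 'L' if the sibling goes on the left."""
    node = leaf
    for sibling, side in proof:
        node = h(sibling + node) if side == "L" else h(node + sibling)
    return node == root

# Toy tree over 8 "transactions":
leaves = [h(bytes([i])) for i in range(8)]  # H1..H8
H12   = h(leaves[0] + leaves[1])
H34   = h(leaves[2] + leaves[3])
H56   = h(leaves[4] + leaves[5])
H78   = h(leaves[6] + leaves[7])
H1234 = h(H12 + H34)
H5678 = h(H56 + H78)
root  = h(H1234 + H5678)  # H12345678, known from the block header

# Prove H4 (leaves[3]) is in the tree with just three hashes:
proof = [(leaves[2], "L"), (H12, "L"), (H5678, "R")]
assert verify_merkle_proof(leaves[3], proof, root)
```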
The deeper the tree, though, the longer the route to the bottom, and thus the more items you need for the proof. So shallow and wide trees would seem good for making efficient proofs.
The problem is that if you wanted to make a Merkle tree wider by adding more
children under each node, it would be incredibly inefficient. You need to hash
together all siblings to make your way up the tree, so then you’d need to receive
more sibling hashes for the Merkle proof. This would make the proof sizes massive.
This is where efficient vector commitments come in. Note the hashes used in
Merkle trees are actually vector commitments – they’re just bad ones that only
commit to two elements efficiently. So we want vector commitments where we
don’t need to receive all the siblings to verify it. Once we have that, we can make
the trees wider and decrease their depth. This is how we get efficient proof sizes –
decreasing the amount of information that needs to be provided.
A Verkle trie is similar to a Merkle tree, but it commits to its children using an
efficient vector commitment (hence the name “Verkle”) instead of a simple hash. So
the basic idea is that you can have many children for each node, but you don’t need all the children to verify the proof. It’s a constant-size proof regardless of the width.
We actually covered a great example of one of these possibilities before – KZG
commitments can also be used as vector commitments. In fact, that’s what
Ethereum devs initially planned to use here. They’ve since pivoted to Pedersen
commitments to fulfill a similar role. These will be based on an elliptic curve (in this
case Bandersnatch), and they’ll commit to 256 values each (a lot better than two!).
So why not have a tree of depth one that’s as wide as possible then? This would be
great for the verifier who now has a super compact proof. But there’s a practical
tradeoff that the prover needs to be able to compute this proof, and the wider it is
the harder that gets. So these Verkle tries will lie in between the extremes at 256
values wide.
State Expiry
Weak statelessness removes state bloat constraints from validators, but state
doesn’t magically disappear. Transactions cost a finite amount, but they inflict a
permanent tax on the network by increasing the state. State growth is still a
permanent drag on the network. Something needs to be done to address the
underlying issue.
This is where state expiry comes in. State that’s been inactive for a long period of
time (say a year or two) gets chopped from what even block builders need to carry.
Active users won’t notice a thing, and deadweight state that’s no longer needed can
be discarded.
If you ever need to resurrect expired state, you’ll simply need to provide a proof and reactivate it. This falls back on the 1-of-N storage assumption – as long as someone still has the full history (block explorers, etc.), you can get what you need from them.
Weak statelessness will blunt the immediate need for state expiry on the base layer,
but it’s good to have in the long run especially as L1 throughput increases. It’ll be an
even more useful tool for high throughput rollups. L2 state will grow at orders of
magnitude higher rates, to the point where it will even be a drag on high-
performance builders.
Part III – It’s all MEV
PBS was necessary to safely implement DS, but recall that it was actually designed
at first to combat the centralizing forces of MEV. You’ll notice a recurring trend in
Ethereum research today – MEV is now front and center in cryptoeconomics.
Designing blockchains with MEV in mind is critical to preserving security and
decentralization. The basic protocol level approach is:
1. Mitigate harmful MEV as much as possible (e.g., single-slot finality, single
secret leader election)
2. Democratize the rest (e.g., MEV-Boost, PBS, MEV smoothing)
The remainder must be easily captured and spread amongst validators. Otherwise,
it will centralize validator sets due to the inability to compete with sophisticated
searchers. This is exacerbated by the fact that MEV will comprise a much higher
share of validator rewards after the merge (staking issuance is far lower than the
inflation given to miners). It can’t be ignored.
MEV Supply Chain Today
Today’s sequence of events looks like this:
Mining pools have played the builder role here. MEV searchers relay bundles of
transactions (with their respective bids) to mining pools via Flashbots. The mining
pool operator aggregates a full block and passes along the block header to individual miners. The miner attests to it with PoW, giving it weight in the fork-choice rule.
Flashbots arose to prevent vertical integration across the stack – this would open
the door to censorship and other nasty externalities. When Flashbots began, mining
pools were already beginning to strike exclusive deals with trading firms to extract
MEV. Instead, Flashbots gave them an easy way to aggregate MEV bids and avoid
vertical integration (by implementing MEV-geth).
After the merge, mining pools disappear. We want to open the door to at-home
validators being reasonably able to operate. This requires finding someone to take
on the specialized building role. Your at-home validator probably isn’t quite as good
as the hedge fund with a payroll of quants at capturing MEV. Left unchecked, this would centralize the validator set if regular folk can’t compete. Structured properly, the protocol can redirect that MEV revenue toward the staking yields of everyday validators.
MEV-Boost
Unfortunately in-protocol PBS simply won’t be ready at the merge. Flashbots comes
to the rescue again with a stepping stone solution – MEV-Boost.
Validators post-merge will default to receiving public mempool transactions directly
into their execution clients. They can package these up, hand them to the
consensus client, and broadcast them to the network. (If you need a refresher on
how Ethereum’s consensus and execution clients work together, I cover this in Part
IV).
But your mom and pop validator has no idea how to extract MEV as we discussed,
so Flashbots is offering an alternative. MEV-Boost will plug into your consensus
client, allowing you to outsource specialized block building. Importantly, you still
retain the option to use your own execution client as a fallback.
MEV searchers will continue to play the role they do today. They’ll run specific
strategies (stat arb, atomic arb, sandwiches, etc.) and bid for their bundles to be
included. Builders then aggregate all the bundles they see, as well as any private order flow (e.g., from Flashbots Protect), into the optimal full block. The builder passes only the header to the validator via a relay connected to MEV-Boost. Flashbots intends to run the relay and builder with plans to decentralize over time, but whitelisting additional builders will likely be slow.
MEV-Boost requires validators to trust relayers – the consensus client receives the
header, signs it, and only then is the block body revealed. The relayer’s purpose is
to attest to the proposer that the body is valid and exists, so that the validators
don’t have to trust the builders directly.
When in-protocol PBS is ready, it then codifies what MEV-Boost offers in the
interim. PBS provides the same separation of powers, allows for easier builder
decentralization, and removes the need for proposers to trust anyone.
Committee-driven MEV Smoothing
PBS also opens the door to another cool idea – committee-driven MEV smoothing.
We saw the ability to extract MEV is a centralizing force on the validator set, but so
is the distribution. The high variability in MEV rewards from one block to another
incentivizes pooling many validators to smooth out your returns over time (as we
see in mining pools today, although to a lesser degree here).
The default is to give the actual block proposer the full payment from the builders.
MEV smoothing would instead split that payment across many validators. A
committee of validators would check the proposed block and attest to whether or
not that was indeed the block with the highest bid. If everything checks out, the
block goes ahead and the reward is split amongst the committee and proposer.
This solves another concern as well – out-of-band bribes. Proposers could otherwise be incentivized to submit a suboptimal block and take an out-of-band bribe directly, hiding their payments from delegators, for example. This attestation keeps proposers in check.
Part IV – The Merge: Under the Hood
Nobody will shut up about it, so I feel obligated to at least give it a brief shout-out. This concludes your Ethereum crash course.
Clients After the Merge
Today you run one monolithic client (e.g., Go Ethereum, Nethermind, etc.) that
handles everything. Specifically, full nodes do both:
Execution – Execute every transaction in a block to ensure validity. Take the
pre-state root, execute everything, and check that the resulting post-state root
is correct
Consensus – Verify you’re on the heaviest chain, i.e., the one with the most PoW work done (Nakamoto consensus)
They’re inseparable because full nodes not only follow the heaviest chain, they
follow the heaviest valid chain. That’s why they’re full nodes and not light nodes. Full
nodes won’t accept invalid transactions even in the event of a 51% attack.
The Beacon Chain currently only runs consensus to give PoS a test run. No execution. Eventually a terminal total difficulty will be decided upon, at which point the current Ethereum execution blocks will merge into the Beacon Chain blocks, forming one chain:
However, full nodes will run two separate clients under the hood that interoperate:
Execution client (f.k.a. “Eth1 client”) – Current Eth 1.0 clients continue to handle
execution. They process blocks, maintain mempools, and manage and sync
state. The PoW stuff gets ripped out.
Consensus client (f.k.a. “Eth2 client”) – Current Beacon Chain clients continue
to handle PoS consensus. They track the chain’s head, gossip and attest to
blocks, and receive validator rewards.
Consensus clients receive Beacon Chain blocks, execution clients run the transactions, and the consensus clients then follow that chain if everything checks out. You’ll be able to mix and match the execution and consensus clients of your choice – all will be interoperable. A new Engine API will be introduced for the two clients to communicate with each other:
Consensus After the Merge
Today’s Nakamoto Consensus is simple. Miners create new blocks, and they add
them to the heaviest observed valid chain.
Post-merge Ethereum moves to Gasper – combining Casper FFG (the finality tool) with LMD GHOST (the fork-choice rule) to reach consensus. The TLDR here – this is a liveness-favoring consensus design as opposed to a safety-favoring one.
The distinction is that safety-favoring consensus algorithms (e.g., Tendermint) halt when they fail to receive the requisite number of votes (⅔ of the validator set here). Liveness-favoring chains (e.g., PoW + Nakamoto consensus) continue building an optimistic ledger regardless, but they can’t reach finality without sufficient votes. Bitcoin and Ethereum today never reach finality – you just assume that after a sufficient number of blocks a reorg won’t occur.
However, Ethereum will also achieve finality by checkpointing periodically with
sufficient votes. Each instance of 32 ETH is a separate validator, and there are
already over 380,000 Beacon Chain validators. Epochs consist of 32 slots with all
validators split up and attesting to one slot within a given epoch (meaning ~12,000
attestations per slot). The fork-choice rule LMD GHOST then determines the current
head of the chain based on these attestations. A new block is added every slot (12
seconds), so epochs are 6.4 minutes. Finality is achieved with the requisite votes
generally after two epochs (so every 64 slots, though it can take up to 95).
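Those numbers back out with some quick arithmetic (using the figures above):

```python
SLOT_SECONDS, SLOTS_PER_EPOCH, VALIDATORS = 12, 32, 380_000
print(VALIDATORS / SLOTS_PER_EPOCH)         # ~11,875 attestations per slot
print(SLOTS_PER_EPOCH * SLOT_SECONDS / 60)  # 6.4 minutes per epoch
print(2 * SLOTS_PER_EPOCH)                  # 64 slots to typical finality
```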
Concluding Thoughts
All roads lead to the endgame of centralized block production, decentralized
trustless block validation, and censorship resistance. Ethereum’s roadmap has this
vision square in its sights.
Ethereum aims to be the ultimate unified DA and settlement layer – massively
decentralized and secure at the base with scalable computation on top. This
condenses cryptographic assumptions to one robust layer. A unified modular (or
disaggregated now?) base layer with execution included also captures the highest
value across L1 designs – leading to monetary premium and economic security as I
recently covered (now open-sourced here).
I hope you came away with a clearer view of how interwoven all of this Ethereum research is. There are so many moving pieces, it’s very cutting-edge, and there’s a really big picture to wrap your head around. It’s hard to keep track of.
Fundamentally, it all makes its way back to that singular vision. Ethereum presents a
compelling path to massive scalability while holding dear those values we care so
much about in this space.