EVM Equivalence

7 Trust Base, Proofs, Gaps and Limitations

Closing the overview of the project, it’s good to have a word on the topics that give its name to this chapter.

7.1 Trust Base

Our assumptions can be stratified into fundamental and operational. Fundamental assumptions are those that constitute the basic assumptions needed for this whole enterprise to make any kind of sense. These are assumptions regarding the correctness of the K framework tooling. Operational assumptions are assumptions or axioms that we might employ in the proving of theorems. These include assumptions that are needed for certain theorems to be true or axioms that should be proven as theorems but have been left without proof due to time constraints.

7.1.1 Fundamental Assumptions

  • K framework: Correct implementation of Matching Logic, of which a deep embbedding already exists formalized in Lean 4.

  • KEVM summarization: The summarization tooling built in KEVM that produces the “compressed” rewrite rules containing the computational content of the opcodes is correctly done.

  • K-to-Lean extraction: The klean tool built into K does a faithful enough conversion of the compiled Kore theory from the evm-semantics.

7.1.2 Operational Assumptions

  • Well-formedness constraints: Due to how the summaries are generated, some of the symbolic parameters are under-constrained. For instance, KEVM words are unbouned integers. The semantics implements checks that ensure these symbolic parameters are indeed within the correct ranges. However, since the summaries extract the computational behavior after the appropriate checks are performed, we have to add these as extra hypotheses.

    Other necessary hypotheses are also added such as there being enough gas to execute the opcode.

  • Unproven results: There is a number of unproven results in which we rely. These range from axioms (which most should be turned into theorems and proven) and theorems with no proofs or missing subgoals.

7.2 Proofs

The overall goal of this project is to prove the equivalence of executing arbitrary EVM programs by bisimulation between Nethermind’s EvmYulLean and KEVM. However, at the moment, we don’t have a formalized notion of bisimulation. Instead, we give intermediate results that should entail bisimulation (or, as of the time of writing, simulation). This is done on a per-opcode basis. We believe showing bisimilarity at the opcode level should guarantee bisimilarity when executing full EVM programs.

7.3 Gaps and Limitations

  • EvmYulLean reasoning interfaces: At the time of making this project, the EvmYulLean model is geared towards execution. This means that the types and overall structure of the model are more execution-oriented than reasoning-oriented, which poses some reasoning challenges that are resolved through our reasoning interfaces.

  • EvmYulLean ffi usage: One particular instance of the execution orientedness of the Lean model is the use of Lean’s ffi. Since ffi functions are declared as opaque and have no explicit computational content at reasoning-time, we have circumvented this by asserting axioms (which at the moment cannot be proven) to provide the missing computational content for reasoning.

  • K-specific idioms when extracting Kore to K: Kore and Lean syntaxes differ quite a bit and both have their own idiosyncrasy. Hence, to reason efficiently with the generated K code we also provide an interface to smooth out the Kore specific patterns and make it more Lean-friendly.

  • Slow compilation turnaround by Lean: The Lean-extracted Kore code is not small. Compiling this code and some other proofs is quite time consuming. At the moment, the average time it takes to compile the project can be around 40 minutes. This means that the “deeper” a change is in the project, the slower the iteration time.