One Substrate, Many Instruments: A General Framework for Program Analysis

·4 min read

There's a sleight of hand in how people talk about "program analysis frameworks." Having an interpreter and an abstract interpreter is not a framework — it's two tools that happen to read the same language. A framework is the thing that lets you add the third, fourth, and nth instrument without re-describing the language each time. That distinction is the whole subject here.

A substrate plus a bench of instruments

The framework, in one sentence: fix a single executable semantics as the substrate; express every analysis as an instrument that folds that substrate into its own carrier; and bind each instrument back to the substrate with an explicit linking theorem. Adding a tool becomes "fill six slots and discharge one theorem." Nothing is hand-reimplemented, so nothing can drift.

The disease the substrate cures is drift: a hand-written interpreter, a separate hand-written abstract semantics, a third hand-written verifier model — three reimplementations of "the language" that nobody proved equal, diverging quietly over time. There are only two honest ways to make two tools agree: share the semantics literally (agreement is free), or link two descriptions by a theorem (adequacy, simulation, soundness, conformance). Skip both and you get drift.

The instrument interface: six slots

Any analysis method — interpreter, abstract interpreter, symbolic executor, model checker, type system, deductive verifier, proof checker — is the substrate folded through six choices:

  1. Carrier — what the fold lands in (concrete state, an abstract lattice, a symbolic store + path condition, a formula, a proof tree)
  2. Transfer per operator — the interpretation of each constructor in that carrier
  3. Fixpoint strategy — run-to-halt, Kleene + widening, bounded unrolling, a supplied invariant
  4. Query language — what you may ask (one value, "∀ reachable φ", "∃ input reaching L", a Hoare triple, an LTL formula)
  5. Linking obligation — the theorem binding this view to the substrate
  6. Evidence / output — a value, an invariant, a counterexample witness, a proof object

Slots 1–3 make it run; slot 5 makes it trustworthy; slots 4 and 6 make it useful. The payoff of this framing is that it absorbs the apparent rivals: model checking, dataflow analysis and type inference aren't alternatives to abstract interpretation — they're instances of it, differing only in carrier and query, not in kind.

The monad knob: literal sharing

The deepest answer to "how do they all see the same behaviour" is literal sharing. Several instruments can be the same generated code, differing only in one plugged-in module — a monad that decides the mode (how branches combine) and hooks that decide the domain (what + means). Turn the branch knob: an interpreter picks one continuation, an abstract interpreter joins all into a lattice element, a symbolic executor keeps all with their path conditions. Same dispatch, so they cannot disagree on it; soundness reduces to a per-operator hook lemma. This is the result of Abstracting Abstract Machines and Abstracting Definitional Interpreters — and it's the strongest answer to the "same behaviour" question because there's no transcription to get wrong.

Linking theorems are load-bearing

Non-contradiction between instruments is a theorem, never a gift. The species run from weakest trusted assumption to strongest guarantee: conformance (differential testing — checks, doesn't prove) → Galois soundness (α ∘ F ∘ γ ⊑ F♯) and under-approximation soundness (every reported bug is real) → adequacy, simulation/refinement, relative completenesskernel soundness (LCF/de Bruijn) and ultimately by-construction (no dispatch theorem at all). The framework lets you choose, per instrument, how much rigour to buy — and upgrade a link from conformance to by-construction without touching the substrate.


I built this up as a visual, interactive explainer: a six-slot instrument interface, a clickable monad knob that turns one skeleton into five different tools, a filterable bench of fourteen analysis methods each shown as a slot-filling, the full linking-theorem zoo, and an analyzability contract you can "provide" layer by layer to watch which instruments come online.

Open the interactive explainer

It's a level up from the companion piece, one behaviour, many views — where that one asks in what sense are all these tools talking about the same thing, this one turns the answer into a build recipe and a contract. The flashcards below are the TL;DR.

Flashcards Available

This post includes 8 flashcards for spaced repetition. Download the deck to import into Anki.

Download Anki Deck
How to import into Anki
  1. Download the .txt file above
  2. Open Anki and click Import
  3. Select the downloaded file
  4. Choose Basic card type
  5. Ensure fields map to Front and Back
  6. Click Import