Skip to content

17 — Capstone: fearless parallelism & byte-identical determinism #26

Description

@rahlk

Part of #9. Phase 3 — the capstone. The chapter the whole book was building to: TRPL
Fearless Concurrency, applied to your own compute-heavy analyzer.

Learning goals

rayon data parallelism, Send/Sync as design constraints (your arena/NodeId choice from
issue 11 pays off HERE — index-based graphs are trivially Send), a Kahn-style ready-queue
scheduler, and the hard discipline: determinism as a tested property, not a hope.

Task

Parallelize by the dependency structure (references/dataflow-construction.md § Parallel execution model):

  • Level-1 per-file build + stages 1–4 per callable: embarrassingly parallel — rayon fan-out
    (par_iter). The whole-program alias solve (such as it is) runs CONCURRENTLY with stages 1–4
    and joins before stage 6.
  • Stages 6–7: ready-queue wavefront over the SCC condensation DAG — per-SCC dependency
    counters; a completed summary decrements dependents, enqueues the ready. The SCC is the atomic
    unit (its internal fixpoint stays on one worker). Strictly better than level barriers.
  • -j/--jobs N: output must be BYTE-IDENTICAL to --jobs 1. The monotone framework makes
    the fixpoint schedule-independent; discovery ORDER is what varies — so never assign ids or
    emit during parallel execution: collect, then sort by (signature, node_id). --jobs 1 stays
    the debug mode forever.
  • Sequential-first: every gate green at --jobs 1 BEFORE parallelizing; the sequential output is
    then the differential oracle.
  • Watch memory, not just CPU: release per-callable CFG/DFG structures once the PDG is emitted.

Gate (determinism gate)

  • --jobs 8 output byte-identical to --jobs 1 on the fixture — for the per-callable fan-out
    AND the SCC wavefront (diff the files in the test).
  • A wall-clock improvement is demonstrated on a real crate (tokio or similar) and recorded in
    the PR description.
  • When this closes, the epic closes: symbol table → call graph → CFG → PDG → SDG → slicing →
    taint → parallel and deterministic. That is a mature analyzer, and you wrote every line.

Metadata

Metadata

Assignees

Labels

learning-ladderThe escalating-complexity curriculum issueslevel-3Native dataflow: CFG/PDG/SDG

Type

No type

Fields

No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions