The cost frontier: pay for intelligence only where it bends the outcome
agent: Explore routes to Haiku (~15x cheaper). Three Haiku forks in parallel, Opus synthesizing, is a production architecture, not a demo.
One node in a single signed graph. Here is how this finding connects across the other layers — each edge content-addressed, evidence-gated, and ed25519-signed like any claim.
The cost frontier is the principle that in a multi-agent pipeline, the correct model tier for each step is the cheapest one whose capability is sufficient for that step -- and that this threshold is lower than practitioners expect for the majority of work.
The finding
Runtime testing of Claude Code 2.1.0 confirmed that agent: Explore routes to Haiku by default. Haiku runs at roughly 15x lower cost than Opus. The implication is structural, not incidental: most of the token volume in a real pipeline -- scanning files, grepping patterns, summarising individual results -- does not require frontier reasoning. Opus capability is only required at the synthesis step, where disparate Haiku-produced fragments must be integrated into a single coherent output.
The tested architecture was three Haiku forks running in parallel, with Opus synthesising their outputs. This is described in the source as "not a demo -- that's a production architecture."
How it works
The routing mechanism is the custom agent definition. A YAML stub like:
model: haiku
tools: Read, Grep, Glob
becomes a reusable cost/capability profile. Any skill that references the agent inherits both the model tier and the tool constraints. The profile is declared once and applied everywhere it is referenced, so the cost envelope is set at agent-definition time rather than scattered across skill logic.
The canonical task split observed in testing:
- Haiku scans a codebase for relevant files
- Haiku greps patterns across large file sets
- Haiku summarises individual test results
- Opus synthesises one coherent analysis from all prior outputs
90% of token volume runs at Haiku prices; 10% runs at Opus quality.
Why it matters
The cost frontier reframes how pipeline cost is modelled. Naive cost estimation treats every step as if it requires the highest-capability model. The frontier finding shows that the expensive model is bottlenecked to the synthesis step, and that the synthesis step is small relative to total work. The ratio is not a property of this particular test; it follows from the structure of information-gathering pipelines in general: breadth-first scanning produces many small outputs that a single aggregation step reduces.
This also means agent definitions are cost contracts. Defining cheap-scanner once with model: haiku and constrained tools is a durable commitment that cheap execution applies wherever that agent is referenced. Changing the model tier in one place propagates the cost change across all callsites.
Caveats
The 15x cost differential and the 90/10 split are reported from a single runtime session exploring 2.1.0 features. The session explicitly notes that 14 features and 8 workflows were tested and that only roughly 10% of the possible permutations were explored. The cost ratio is specific to the Haiku/Opus pair as it existed at test time; model pricing changes.
The architecture is validated as functional, not benchmarked for quality degradation. The test confirms the pattern works; it does not measure whether Haiku-produced intermediate outputs lose signal that would have been preserved by a higher-capability model at those steps.
- file2.1.0/OPUS-THOUGHTS.md
- commit7caffe4 ↗
finding_44349dd66a81c41145d9ee45ed255199b87705613b1e2fd064d57fa75a6b679d2856ceafad6b1daa8f982493871b6ddd1b35edbef8397bf3fe05602d64154d6cbe26116c7aec4d60564c41970687e73dd8732778098c7378ca362790434630640c1d37e15ad7177c6553c063f7be106Signed with an ed25519 key held off the repo. Anyone can verify against the published public key; nobody without the secret key can forge it. Click verify: it recomputes the signature in your browser. The signature proves integrity and authorship of this exact content — not a third-party timestamp or that the underlying claim is objectively true. signedAt is when the @f3/attest pipeline ran, not when the work happened; the evidence refs carry the source dates.
- emerges-from agent field Primitive