Model-Tiered Pipeline

A workflow pattern that routes task stages through agents matched to model capability and cost, running cheap scans and expensive synthesis in sequence.

How It Works

Model-tiered pipelines combine two features: custom agent definitions (which pair a model with a specific tool set) and skill routing (which directs a skill to a particular agent). The pattern uses these to build multi-stage flows in which early stages route to cost-optimized models and later stages route to more capable ones.

Define agents upfront:

cheap-scanner: Haiku model with Read/Grep/Glob tools only
smart-analyzer: Opus model with full tool access
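
As a rough illustration, the two definitions above can be modeled as immutable records that pair a model tier with an allowed-tools set. This is a sketch only, not the actual agent definition format: `AgentSpec`, `ALL_TOOLS`, `can_use`, and the tool names beyond Read/Grep/Glob are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)  # frozen: a definition is fixed once created
class AgentSpec:
    """Hypothetical stand-in for a custom agent definition."""
    name: str
    model: str                      # illustrative tier label, not a real model ID
    allowed_tools: FrozenSet[str]   # tools the agent is permitted to call


# Assumption: "full tool access" is modeled as an explicit, illustrative set.
ALL_TOOLS = frozenset({"Read", "Grep", "Glob", "Edit", "Write", "Bash"})

CHEAP_SCANNER = AgentSpec("cheap-scanner", "haiku", frozenset({"Read", "Grep", "Glob"}))
SMART_ANALYZER = AgentSpec("smart-analyzer", "opus", ALL_TOOLS)


def can_use(agent: AgentSpec, tool: str) -> bool:
    """Return True only if the tool is in the agent's allowed set."""
    return tool in agent.allowed_tools
```

With these definitions, `can_use(CHEAP_SCANNER, "Bash")` is false while `can_use(SMART_ANALYZER, "Bash")` is true, which mirrors the read-only versus full-access split.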

A skill pipeline then invokes the agents in order: cheap-scanner finds candidate items → smart-analyzer analyzes the findings. This splits the work by cost: for instance, 90% of the work (reads and scans) runs at Haiku cost, and the remaining 10% (synthesis) runs at Opus quality.
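
The two-stage flow can be sketched as a function that sends every item through the cheap agent and forwards only the flagged items to the expensive one. The `run_agent` callable, its signature, and `fake_runner` are assumptions made for illustration; a real setup would dispatch through skill routing rather than a Python function.

```python
from typing import Callable, Dict, List

# Hypothetical runner signature: (agent_name, prompt) -> text response.
Runner = Callable[[str, str], str]


def run_pipeline(paths: List[str], run_agent: Runner) -> Dict[str, object]:
    """Stage 1 scans every path cheaply; stage 2 analyzes only flagged paths."""
    candidates = []
    for path in paths:
        verdict = run_agent("cheap-scanner", f"Does {path} need review? Answer yes or no.")
        if verdict.strip().lower().startswith("yes"):
            candidates.append(path)

    if not candidates:
        # Nothing flagged, so the expensive agent is never invoked.
        return {"candidates": [], "report": None}

    report = run_agent("smart-analyzer", "Analyze these findings:\n" + "\n".join(candidates))
    return {"candidates": candidates, "report": report}


def fake_runner(agent: str, prompt: str) -> str:
    """Stub standing in for a real model call so the sketch runs offline."""
    if agent == "cheap-scanner":
        return "yes" if "auth" in prompt else "no"
    return f"analyzed {prompt.count(chr(10))} item(s)"


result = run_pipeline(["src/auth.py", "README.md", "src/auth_tokens.py"], fake_runner)
```

Here the cheap agent is called once per path and the expensive agent at most once per run, so the number of expensive calls does not grow with the number of items scanned.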

What the Test Found

The pattern was validated in runtime testing using custom agent definitions and skill routing. Results confirmed:

Custom agents enforce model and tool constraints consistently
Skills successfully route to specified agents
Tool restrictions are enforced at the availability level (a skill cannot call a tool its agent does not expose)
Cost benefits hold: high-volume scanning at cheap rates, focused analysis at expensive rates

No hot-reload mechanism exists for agent redefinition; agents must be defined upfront before skills invoke them.

Why It Matters

Agent-model pairing is typically a one-per-session or one-per-project choice. This pattern makes it a per-task choice. Scanning, classification, and filtering become expensive at scale when run at Opus cost, while reasoning and synthesis gain little from cheap models. Tiering separates these concerns: Haiku handles the high-volume work, and Opus is reserved for the decisions that need it.
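
The cost argument can be made concrete with a small calculation. The per-token prices and token counts below are placeholder values chosen for round numbers, not actual pricing or measured results, and `tiered_cost` is a hypothetical helper.

```python
def tiered_cost(scan_tokens: int, synth_tokens: int,
                cheap_price: float, expensive_price: float) -> float:
    """Total cost when scanning runs on the cheap tier and synthesis on the expensive one.

    Prices are per million tokens, in arbitrary currency units.
    """
    return (scan_tokens * cheap_price + synth_tokens * expensive_price) / 1_000_000


# Placeholder prices per million tokens (assumptions, not real rates).
CHEAP_PRICE = 1.0
EXPENSIVE_PRICE = 15.0

# 10M tokens of work, split 90% scanning / 10% synthesis.
tiered = tiered_cost(9_000_000, 1_000_000, CHEAP_PRICE, EXPENSIVE_PRICE)

# Same workload with everything on the expensive tier.
single_tier = tiered_cost(0, 10_000_000, CHEAP_PRICE, EXPENSIVE_PRICE)

print(tiered, single_tier)  # 24.0 150.0
```

Under these assumed numbers the tiered run costs 24 units against 150 for the single-tier run. The actual ratio depends on real prices and on how much of the workload is scanning.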

Caveats

Agents must be defined upfront; their definitions cannot be modified mid-session
Tool restrictions are enforced; skills cannot exceed their agent's allowed-tools list
The best point at which to split tiers is task-dependent; there is no automatic cost tuning