Teams vs Fire-and-Forget: When Each Wins — Experiments

Tested both patterns across a real deployment standardization effort. 28 tasks, 6 waves, 47 files created. This is the practical verdict.

The Workload

Wave	Tasks	Approach Used	Result
1 (foundation)	5	orchestrator inline + 2 haiku	clean
2 (providers)	7	single sonnet fire-and-forget	17 files, 13 min, clean typecheck
3 (combinators)	5	sonnet fire-and-forget	typed stubs, missed that vendor pkg existed
4 (CLI + templates)	3	4 parallel haiku fire-and-forget	all clean
5 (docs)	5	haiku fire-and-forget	all clean
6 (migration)	3	team (1 sonnet worker)	caught runner gap, got real-time guidance

Fire-and-Forget Wins

Use for: parallel independent tasks, static DAGs, bounded single-turn work.

Evidence:

Wave 2 (providers): single sonnet agent produced 17 files, all typechecking, 13 min. No coordination needed: the agent had full context upfront.
Wave 4 (CLI + templates): 4 parallel haiku agents, all succeeded independently. Zero cross-agent communication needed.
Wave 5 (docs): 3 haiku agents wrote docs in parallel. Each had complete context in the prompt.

Key property: when you can fully specify the task in the dispatch prompt, fire-and-forget is the better choice. It avoids shutdown protocol overhead, idle notification noise, and message handling complexity.
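As an illustration of the "static DAG" case, here is a minimal sketch that groups tasks into waves that can each be dispatched in parallel. The Task shape, the planWaves name, and the example task ids are assumptions made for this sketch, not part of the tooling used in the experiment.

```typescript
// Hypothetical task shape: an id plus the ids of the tasks it depends on.
interface Task {
  id: string;
  dependsOn: string[];
}

// Group a static DAG into waves. Every task in a wave has all of its
// dependencies satisfied by earlier waves, so the tasks within a wave
// are independent and can be sent out as parallel fire-and-forget agents.
function planWaves(tasks: Task[]): string[][] {
  const done = new Set<string>();
  let remaining = [...tasks];
  const waves: string[][] = [];

  while (remaining.length > 0) {
    const ready = remaining.filter((t) => t.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      // Either a cycle or a dependency on a task id that was never declared.
      throw new Error("unresolvable dependencies: " + remaining.map((t) => t.id).join(", "));
    }
    waves.push(ready.map((t) => t.id));
    for (const t of ready) done.add(t.id);
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return waves;
}

// Illustrative ids only; these edges are not the real project's dependencies.
const exampleTasks: Task[] = [
  { id: "a", dependsOn: [] },
  { id: "b", dependsOn: ["a"] },
  { id: "c", dependsOn: ["a"] },
  { id: "d", dependsOn: ["b", "c"] },
];

const exampleWaves = planWaves(exampleTasks);
// exampleWaves is [["a"], ["b", "c"], ["d"]]
console.log(exampleWaves);
```

If the graph cannot be written down before dispatch, that is a hint the work is not fully specifiable and may belong in the team category instead.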

Teams Win

Use for: tasks where agents discover gaps requiring orchestrator judgment, pipeline workloads with iterative feedback.

Evidence:

Wave 6 (ralph migration): the worker discovered that FullStackConfig has no slot for the runner service (a second HTTP app). It sent a message, got real-time guidance (“extend config later, don’t block”), and continued to verification. A fire-and-forget agent would likely have either silently skipped the gap or made a wrong assumption.

Key property: when the task has unknown unknowns, things the agent cannot predict from the prompt alone, a team lets the agent ask instead of thrash.

The Subagent Verification Problem

Wave 3 exposed the core weakness of fire-and-forget: agents make confidently wrong assumptions without checking them.

The sonnet combinator agent assumed @f3/vendor-alchemy-effect did not exist yet (it did, and had been typechecking since wave 1). It produced clean stubs with // TODO: implement when vendor pkg exists. Structurally correct, functionally empty.
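A clean typecheck does not catch this failure mode, so one cheap guard is to scan fire-and-forget output for stub markers before accepting it. The sketch below is an assumption about how such a check could look: findStubs, StubHit, and the example file path are hypothetical, and only the marker text comes from the wave 3 output.

```typescript
// Marker text seen in the wave 3 stubs; extend the list for other phrasings.
const STUB_MARKERS = ["TODO: implement"];

interface StubHit {
  file: string;
  line: number; // 1-based line number
  text: string; // trimmed line content
}

// Scan agent-produced files (path -> contents) and report every line
// that contains one of the stub markers.
function findStubs(files: Record<string, string>, markers: string[] = STUB_MARKERS): StubHit[] {
  const hits: StubHit[] = [];
  for (const [file, contents] of Object.entries(files)) {
    contents.split("\n").forEach((text, i) => {
      if (markers.some((m) => text.includes(m))) {
        hits.push({ file, line: i + 1, text: text.trim() });
      }
    });
  }
  return hits;
}

// Hypothetical file path and body, shaped like the stubs described above.
const agentOutput: Record<string, string> = {
  "combinators/example.ts": [
    "export function example(): never {",
    "  // TODO: implement when vendor pkg exists",
    '  throw new Error("not implemented");',
    "}",
  ].join("\n"),
};

for (const hit of findStubs(agentOutput)) {
  console.log(`${hit.file}:${hit.line} ${hit.text}`);
}
```

A non-empty result does not prove the agent guessed wrong, but it is a signal to check the assumption named in the TODO before moving to the next wave.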

A team member could have asked: “does the vendor package exist? I cannot find imports that work.” The orchestrator would have said “yes, check the package.json exports” and the agent would have produced real implementations.

Lesson: fire-and-forget fails when agents encounter ambiguity and resolve it by guessing instead of investigating. The more exploration-dependent the task, the more teams help.

Decision Heuristic

Can you fully specify the task in the prompt?
├── YES: fire-and-forget
│   ├── Independent tasks? → parallel dispatch
│   └── Coherent subgraph? → single agent
└── NO (unknowns, exploration needed): team
    ├── Single worker: simple back-and-forth
    └── Multiple workers: when workers need to coordinate with each other
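
The tree above can be written as a small function. This is only a sketch: TaskProfile, Approach, and chooseApproach are names invented here, and the three booleans are judgment calls the orchestrator still has to make by hand.

```typescript
type Approach =
  | "fire-and-forget: parallel dispatch"
  | "fire-and-forget: single agent"
  | "team: single worker"
  | "team: multiple workers";

// Hypothetical profile of a piece of work, one field per question in the tree.
interface TaskProfile {
  fullySpecifiable: boolean; // can the prompt carry everything the agent needs?
  independentTasks: boolean; // only consulted when fullySpecifiable is true
  workersMustCoordinate: boolean; // only consulted when fullySpecifiable is false
}

function chooseApproach(p: TaskProfile): Approach {
  if (p.fullySpecifiable) {
    // Known work: split it if the tasks are independent, otherwise keep
    // the coherent subgraph in one agent.
    return p.independentTasks
      ? "fire-and-forget: parallel dispatch"
      : "fire-and-forget: single agent";
  }
  // Discovery work: use a team, sized by whether workers must talk to each other.
  return p.workersMustCoordinate ? "team: multiple workers" : "team: single worker";
}

// Wave 6 as described above: unknowns in the task, one worker.
console.log(
  chooseApproach({ fullySpecifiable: false, independentTasks: false, workersMustCoordinate: false })
);
```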

Practical Notes

Idle notification spam. Teams generate excessive idle notifications. After completing task 3, the worker sent 5 idle pings in 30 seconds. This is system behavior, not agent behavior. Not blocking, just noisy.
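
If the noise becomes distracting, the orchestrator side could collapse repeated idle pings before showing them. The sketch below assumes a message shape (WorkerMessage) and a collapseIdle helper that are both hypothetical; the real notification format is not described in these notes. It also assumes messages arrive in timestamp order.

```typescript
// Hypothetical message shape, not the actual notification format.
interface WorkerMessage {
  from: string;
  kind: "idle" | "message";
  atMs: number; // timestamp in milliseconds
  body?: string;
}

// Keep the first idle ping from a worker and drop further idle pings from
// the same worker that arrive within windowMs of it. Any real message from
// that worker resets the window, so the next idle ping is shown again.
function collapseIdle(messages: WorkerMessage[], windowMs: number): WorkerMessage[] {
  const lastIdleAt = new Map<string, number>();
  const kept: WorkerMessage[] = [];
  for (const m of messages) {
    if (m.kind !== "idle") {
      lastIdleAt.delete(m.from);
      kept.push(m);
      continue;
    }
    const prev = lastIdleAt.get(m.from);
    if (prev === undefined || m.atMs - prev >= windowMs) {
      lastIdleAt.set(m.from, m.atMs);
      kept.push(m);
    }
  }
  return kept;
}

// Five idle pings inside 30 seconds, as in the observation above.
const pings: WorkerMessage[] = [0, 6000, 12000, 18000, 24000].map((atMs) => ({
  from: "worker-1",
  kind: "idle" as const,
  atMs,
}));
console.log(collapseIdle(pings, 60000).length); // 1
```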

Shutdown protocol. Works cleanly. The orchestrator sends a shutdown request, the worker approves it, and the worker terminates. No issues.

Message latency. Sending a message to an idle worker wakes it within seconds. The back-and-forth on tasks 1 to 3 was responsive.

Context preservation. The worker maintained context across all 3 tasks. It remembered the runner gap from task 1 when doing verification in task 2. This is the genuine advantage over fire-and-forget: persistent context across sequential tasks.

Cost. For 3 sequential tasks, a single team worker cost about the same as 3 sequential fire-and-forget agents. The team overhead added maybe 10 seconds total. Negligible.

Summary

Fire-and-forget for known work. Teams for discovery work. The boundary is “can the agent resolve ambiguity from the prompt alone?”

Most deployment tasks are known work: create this resource, write this config, generate this file. But integration tasks, where one system’s assumptions meet another system’s reality, are discovery work. That is where teams earn their keep.