Custom agents — runtime test

Hands-on runtime battle-test of Custom agents. Result: PASS.

Custom agent definitions let a skill route execution to a specific model with a restricted tool set. The runtime test verified both behaviors, which makes the feature usable for cost control and tool sandboxing.

Test Setup and Execution

The test created a custom agent definition at .claude/agents/cheap-researcher.md that specifies the Haiku model and only the Read, Grep, and Glob tools. A skill then referenced this agent through the field agent: cheap-researcher, with context: fork to isolate the execution. The test ran on 2026-01-07 and covered four cases.

What Worked

Test 1 (Custom Agent Routing) passed in a fresh session: the skill ran on the specified Haiku model rather than Opus, and its tools were limited to Read, Grep, and Glob. Test 4 (Tool Restriction Enforcement) confirmed that the Bash tool was genuinely unavailable in the forked context; the restriction is applied at the tool availability level, not at execution time. Together these enable cost control (a cheaper model per task) and security isolation (dangerous tools removed from specific workflows).

What Failed or Broke

Test 2 (Hot-Reload) failed: both adding a new agent definition and modifying an existing one required a session restart. Skills hot-reload, but agents do not. Test 3 (Invalid Agent Name) exposed a silent failure: a skill that referenced a nonexistent agent fell back to the default model and tools with no warning, so a typo in the agent name goes undetected.
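Because an unknown agent name produces no warning, a check outside the session can catch typos before they matter. The following is a minimal sketch of such a check, not part of the tested tool. It assumes skill files carry the same "---" frontmatter shown in the test setup, reads only flat key: value pairs, and takes the list of skill files as an argument because the test report does not say where skills are stored. The function names are hypothetical.

```python
from pathlib import Path


def read_frontmatter(path: Path) -> dict[str, str]:
    """Return the flat key: value pairs between the leading '---' fences."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def known_agents(agent_dirs: list[Path]) -> set[str]:
    """Collect the name of every agent definition in the given directories."""
    names: set[str] = set()
    for directory in agent_dirs:
        if not directory.is_dir():
            continue  # e.g. no global ~/.claude/agents/ on this machine
        for path in sorted(directory.glob("*.md")):
            # Assumption: fall back to the file stem if no name field exists.
            names.add(read_frontmatter(path).get("name", path.stem))
    return names


def unknown_agent_refs(
    skill_files: list[Path], agent_dirs: list[Path]
) -> list[tuple[Path, str]]:
    """Return (skill file, agent name) pairs that match no definition."""
    agents = known_agents(agent_dirs)
    problems: list[tuple[Path, str]] = []
    for skill in skill_files:
        ref = read_frontmatter(skill).get("agent")
        if ref and ref not in agents:
            problems.append((skill, ref))
    return problems
```

Calling unknown_agent_refs with the project's skill files and the directories .claude/agents/ and ~/.claude/agents/ would return the skill that references totally-fake-agent-name. If the tool also provides built-in agent names, those would need to be added to the known set, otherwise the check reports them as unknown.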

Definition and Deployment

Custom agents are Markdown files with YAML frontmatter at .claude/agents/{name}.md. The frontmatter carries name, description, tools (comma-separated), and model (haiku, sonnet, or opus), plus an optional thinking field (on or off). Definitions can live in the project (.claude/agents/), globally (~/.claude/agents/), or in a plugin (plugins/*/agents/). Changes to a definition take effect only after a session restart, and a skill cannot override the model once it names an agent. Whether a skill's allowed-tools setting can override the agent's tool list was not tested.
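To make the format concrete, here is a small sketch that parses one definition and checks the fields described above. It is an illustration under stated assumptions, not the tool's own loader: it handles only flat key: value frontmatter, treats name, description, tools, and model as expected fields, and strips trailing " # comment" text from values. The function and constant names are hypothetical.

```python
ALLOWED_MODELS = {"haiku", "sonnet", "opus"}
# Assumption: these four fields are expected in every definition.
EXPECTED_FIELDS = ("name", "description", "tools", "model")


def parse_agent_definition(text: str) -> dict:
    """Parse the frontmatter of an agent definition and validate its fields."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' frontmatter fence")

    fields: dict = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            # Drop a trailing " # comment" such as "# comma-separated".
            fields[key.strip()] = value.split(" #", 1)[0].strip()
    else:
        # The loop ended without reaching a closing fence.
        raise ValueError("missing closing '---' frontmatter fence")

    missing = [name for name in EXPECTED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    if fields["model"] not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {fields['model']}")

    # Turn "Read, Grep, Glob" into ["Read", "Grep", "Glob"].
    fields["tools"] = [t.strip() for t in fields["tools"].split(",") if t.strip()]
    return fields
```

Run against the cheap-researcher definition from the test setup, this returns a dictionary whose tools entry is the list Read, Grep, Glob and whose model entry is haiku. A definition with a model outside the three listed tiers raises ValueError.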

Status and Limits

The feature passes runtime verification (PASS). Known limitations: no hot-reload during a session, invalid agent names fail silently, and the interactions with skill hooks and with skill-level allowed-tools are untested. The test rated this a "major unlock for sandboxing and cost optimization."

Primary source
2.1.0/tests/06-custom-agents/TEST-RESULTS.md (verbatim from the corpus)

Test Results: Custom Agent Definitions

Feature: agent field in skills can reference custom agent definitions

Tested: 2026-01-07

Test Setup

Created .claude/agents/cheap-researcher.md:

---
name: cheap-researcher
description: Cheap research agent using Haiku
tools: Read, Grep, Glob
model: haiku
---
# Cheap Researcher Agent
...

Created skill referencing it:

---
agent: cheap-researcher
context: fork
---

Test Results

Test 1: Custom Agent Routing (fresh session)

Result: PASS

  • Skill routed to Haiku model (not Opus)
  • Tools limited to Read, Grep, Glob
  • Custom agent definition respected

Test 2: Hot-Reload

Result: FAIL - Agents don't hot-reload

  • New agents require session restart
  • Skills hot-reload, agents don't
  • Modification to existing agents also needs restart

Test 3: Invalid Agent Name

Result: Silent failure

  • agent: totally-fake-agent-name doesn't error
  • Falls back to default model/tools
  • No warning surfaced

Test 4: Tool Restriction Enforcement

Result: PASS - Tools ARE enforced

  • cheap-researcher defines only Read, Grep, Glob
  • Bash tool NOT available in forked context
  • Restriction happens at tool availability level, not execution

Key Findings

Custom Agents Enable:

  1. Model selection - Route to specific model tier
  2. Tool sandboxing - Restrict available tools
  3. Reusable configs - Define once, use in multiple skills

Custom Agent Definition Format:

---
name: agent-name
description: What this agent does
tools: Read, Grep, Glob, Edit  # comma-separated
model: haiku | sonnet | opus
thinking: on | off  # optional
---

# Agent Instructions
...

Location:

  • Project: .claude/agents/
  • Global: ~/.claude/agents/
  • Plugin: plugins/*/agents/

Limitations:

  • No hot-reload (restart required)
  • Invalid names fail silently
  • Can't override model in skill once agent is specified

Use Cases

  1. Cheap Research Agent: Haiku + Read/Grep/Glob only
  2. Safe Edit Agent: Sonnet + Edit/Read (no Bash)
  3. Full Power Agent: Opus + all tools
  4. Specialized Agent: Specific tools for specific domain

Interaction with Other Features

Feature | Works with Custom Agents?
context: fork | YES - forked agent uses custom config
skill hooks | UNTESTED
allowed-tools in skill | UNTESTED - may override?

Status: PASS

Custom agents work. Major unlock for sandboxing and cost optimization. Caveat: No hot-reload.

Evidence & receipt
ed25519 receipt
id: test_c6e3fd09823c673277c53045
alg: ed25519
pubkey: 9b87705613b1e2fd064d57fa75a6b679d2856ceafad6b1daa8f982493871b6dd
sig: 5d31133869fb91104d13f1ad36ad000447ee3e24d419ef76b789f21e9377c9cec7e81fa928edd84ac522dba351b0e0a48aa5e392eb2c3e07fd68522b6f9ed400

Signed with an ed25519 key held off the repo. Anyone can verify against the published public key; nobody without the secret key can forge it. The signature proves integrity and authorship of this exact content. It does not provide a third-party timestamp, and it does not prove that the underlying claim is objectively true. signedAt is when the @f3/attest pipeline ran, not when the work happened; the evidence refs carry the source dates.
