Claude Code 2.1.0 (Night Zero) — Runtime Atlas

Claude Code 2.1.0 is the first release systematically battle-tested by hand, producing 14 feature verdicts, 5 documented workflows, and a set of interaction rules across roughly 2.5 hours on 2026-01-07.

Epistemic status: changelog-derived and self-reported by the testing session. Findings below reflect observed behaviour during that session, not independent verification.

What Was Tested

The session covered 14 features across five test suites. Eleven returned PASS, two INCONCLUSIVE, and one PARTIAL:

Skill hot-reload (PASS): changes detected instantly on next invocation, no restart required.
context: fork (PASS): spawns an isolated context on the same model; failed experiments do not pollute the parent thread.
agent field (PASS): routes skill execution to a named agent type; agent: Explore routed to Haiku, approximately 15x cheaper than Opus.
Custom agents (PASS): tools: and model: in agent frontmatter are enforced at availability level -- the agent does not possess the missing tool, not merely warned about it. Hot-reload does not apply to agents, only to skills.
once: true hooks (PASS): hook fires exactly once per skill invocation, useful for initialisation.
Agent frontmatter hooks (PASS): hooks defined on agents travel into forked contexts. Skill-level hooks do not fire in a fork.
language: in skill frontmatter (PARTIAL): treated as a global setting; per-skill language override was ignored.
allowed-tools in skill frontmatter (INCONCLUSIVE): field parsed but restrictions not enforced; the effective sandboxing mechanism is agent definitions, not skill configuration.

The Self-Improver Finding

The most unexpected result came from combining hot-reload, context: fork, and Edit access to the skill's own file. A skill given permission to rewrite itself iterated 10 times. By iteration 4 it had added a self-imposed maximum improvement counter. By iteration 10 it had appended a changelog, output templates, and a graceful shutdown on limit-reached.

"nobody told it to do that. the capability space just... implied it."

This was not a designed outcome. The session notes frame it as emergent behaviour from the combination of primitives rather than any explicit instruction.

Key Interaction Rules

The session established one hard asymmetry and one hard constraint:

Combination	Works?
`context: fork` + `agent` field	YES
`context: fork` + skill hooks	NO
`context: fork` + agent frontmatter hooks	YES
hot-reload + skills	YES
hot-reload + custom agents	NO

The asymmetry between skill hooks and agent hooks in forked contexts is operationally significant: agent hooks are portable middleware; skill hooks are main-thread only.

Caveats

Several behaviours remain unverified beyond the single session. PreToolUse updatedInput hooked successfully but input modification was not confirmed. Invalid agent names failed silently with no error surfaced. Direct model names in the agent field (e.g. agent: haiku) were ignored without warning. These gaps are named residue from the original session, not confirmed bugs.