Claude Code 2.1.0 (Night Zero)
The first version battle-tested by hand, 2026-01-07. 14 features, 5 workflows, 23 commits in ~2.5 hours.
Claude Code 2.1.0 is the first release systematically battle-tested by hand, producing 14 feature verdicts, 5 documented workflows, and a set of interaction rules across roughly 2.5 hours on 2026-01-07.
Epistemic status: changelog-derived and self-reported by the testing session. Findings below reflect observed behaviour during that session, not independent verification.
What Was Tested
The session covered 14 features across five test suites. Eleven returned PASS, two INCONCLUSIVE, and one PARTIAL:
- Skill hot-reload (PASS): changes detected instantly on next invocation, no restart required.
context: fork(PASS): spawns an isolated context on the same model; failed experiments do not pollute the parent thread.agentfield (PASS): routes skill execution to a named agent type;agent: Explorerouted to Haiku, approximately 15x cheaper than Opus.- Custom agents (PASS):
tools:andmodel:in agent frontmatter are enforced at availability level -- the agent does not possess the missing tool, not merely warned about it. Hot-reload does not apply to agents, only to skills. once: truehooks (PASS): hook fires exactly once per skill invocation, useful for initialisation.- Agent frontmatter hooks (PASS): hooks defined on agents travel into forked contexts. Skill-level hooks do not fire in a fork.
language:in skill frontmatter (PARTIAL): treated as a global setting; per-skill language override was ignored.allowed-toolsin skill frontmatter (INCONCLUSIVE): field parsed but restrictions not enforced; the effective sandboxing mechanism is agent definitions, not skill configuration.
The Self-Improver Finding
The most unexpected result came from combining hot-reload, context: fork, and Edit access to the skill's own file. A skill given permission to rewrite itself iterated 10 times. By iteration 4 it had added a self-imposed maximum improvement counter. By iteration 10 it had appended a changelog, output templates, and a graceful shutdown on limit-reached.
"nobody told it to do that. the capability space just... implied it."
This was not a designed outcome. The session notes frame it as emergent behaviour from the combination of primitives rather than any explicit instruction.
Key Interaction Rules
The session established one hard asymmetry and one hard constraint:
| Combination | Works? |
|---|---|
context: fork + agent field |
YES |
context: fork + skill hooks |
NO |
context: fork + agent frontmatter hooks |
YES |
| hot-reload + skills | YES |
| hot-reload + custom agents | NO |
The asymmetry between skill hooks and agent hooks in forked contexts is operationally significant: agent hooks are portable middleware; skill hooks are main-thread only.
Caveats
Several behaviours remain unverified beyond the single session. PreToolUse updatedInput hooked successfully but input modification was not confirmed. Invalid agent names failed silently with no error surfaced. Direct model names in the agent field (e.g. agent: haiku) were ignored without warning. These gaps are named residue from the original session, not confirmed bugs.
- 22:03
f47d07aThe first commit: a skill-hot-reload demo. One question in the dark. - 22:26
7caffe4agent: Explore routes to Haiku. ~15x cheaper. The cost frontier opens. - 22:47
d4e4467Custom agents: model and tool sandboxing are ENFORCED. - 22:57
d094de9allowed-tools in a skill does not enforce. The first lie, caught. - 23:55
a46898bself-improver v1.3. The skill that edits itself is already three deep. - 00:34
1016484Five combined workflows validated, including self-improving skills.
- file2.1.0/SUMMARY.md
release_6649e5e03eeb1ba9f1919c7fed255199b87705613b1e2fd064d57fa75a6b679d2856ceafad6b1daa8f982493871b6dd9629a5145055554149cb88564588b5737341c10ee9d115e52dccfff0f0d4f82bc5fe77beb80a6a62cd7a7f9857f5580118b7616cb13f1e210e966043be7f080aSigned with an ed25519 key held off the repo. Anyone can verify against the published public key; nobody without the secret key can forge it. Click verify: it recomputes the signature in your browser. The signature proves integrity and authorship of this exact content — not a third-party timestamp or that the underlying claim is objectively true. signedAt is when the @f3/attest pipeline ran, not when the work happened; the evidence refs carry the source dates.
- introduces Skill hot-reload Primitive
- introduces context: fork Primitive
- introduces agent field Primitive
- introduces language setting Primitive
- introduces Skill hooks Primitive
- introduces once: true hooks Primitive
- introduces Custom agents Primitive
- introduces Bash wildcards Primitive
- introduces YAML allowed-tools Primitive
- introduces Task(AgentName) disable Primitive
- introduces Agent frontmatter hooks Primitive
- introduces PreToolUse updatedInput Primitive
- introduces Subagent denial recovery Primitive
- introduces Skills visible by default Primitive