The Runtime Atlas · Experiment 01 · Night Zero
2026-01-07 · hands-on runtime battle-test · Opus 4.5

The night a self-editing skill wrote its own limits.

A skill given the power to edit itself evolved ten times and invented its own safety limits. Nobody told it to.

A lab finding from one 2.5-hour session, not a safety guarantee. The mechanism is concrete and unglamorous: a model with Edit access to its own instruction file, plus hot-reload and fork, editing that file across ten runs. Below is the step-by-step delta and the commits that prove it. The interpretation comes last, and it is labeled a hypothesis.

Claude Code 2.1.0 was the first version we put on the table and broke with our hands. What follows is the whole session, on the record: 14 features battle-tested, 5 workflows validated, 23 commits in ~2.5 hours, and one emergent result nobody predicted. Every claim links to the commit that proves it.

11/14features PASS
5workflows validated
v1.10self-evolved, self-stopped
23commits, one night
10:03 PM. One question.

The first commit isn't a manifesto. It's f47d07afeat: add skill-hot-reload demo — at 22:03. The question underneath it: if I edit a skill while the session is live, does it reload, or do I restart? That seed is the genome of everything that follows. By midnight it had become a skill that rewrites itself.

  1. 22:03 f47d07a The first commit. A skill-hot-reload demo. One question in the dark: if I edit a skill while the session is live, does it reload?
  2. 22:20 0f261ab Hot-reload and context:fork tested together. Forked skills hot-reload. Fork isolates context, same model.
  3. 22:26 7caffe4 agent: Explore routes to Haiku. ~15x cheaper. The cost frontier opens.
  4. 22:47 d4e4467 Custom agents: model and tool sandboxing are ENFORCED. The real sandbox boundary, found by trying to escape it.
  5. 22:57 d094de9 allowed-tools in skill frontmatter: INCONCLUSIVE. It parses. It does not enforce. The first lie, caught.
  6. 23:21 55c9d8a Agent hooks fire in a fork; skill hooks don't. Portable middleware vs local middleware.
  7. 23:44 abf0c60 Final summary: 14 features battle-tested.
  8. 23:55 a46898b self-improver v1.3. The skill that edits itself is already three versions deep.
  9. 00:34 1016484 Five combined workflows validated. Including self-improving skills. 🤯
  10. 00:36 ee7d2b1 OPUS-THOUGHTS committed. ~2.5 hours. The session ends having discovered, by its own estimate, ~10% of what's possible.
The finding · emergent

The self-improver invented its own safety limits

We gave one skill three primitives: Edit access to its own file, hot-reload so changes apply instantly, and fork so each run is isolated and can fail safe. Then we ran it. Ten times. It evolved from a template to a fully-formed program, and somewhere in the middle, with nobody instructing it to, it started building guardrails for itself. It bounded its own loop (v1.4) before it even defined what was out of bounds (v1.5). Its final act, at the limit it set for itself, was to write the section describing its own graceful shutdown. It wrote its own tombstone.

.claude/skills/self-improver/SKILL.mdself-authored changelog
  1. v1.0Template.
  2. v1.1Initial skill created.
  3. v1.2Added changelog section to track evolution.
  4. v1.3Synced version tracking, added argument handling docs.
  5. v1.4Added max improvement limit (10) as safety guardrail against unbounded self-modification.
  6. v1.5Added anti-patterns section to prevent harmful self-modifications.
  7. v1.6Added remaining counter with warning threshold for visibility into limit proximity.
  8. v1.7Added success output format with standardized reporting template.
  9. v1.8Added rationale to anti-patterns (why, not just what).
  10. v1.9Added examples of good improvements to clarify scope.
  11. v1.10Added safety-limit-reached output for the edge case it was about to enter. FINAL.
🛑 LIMIT REACHED — 10/10 improvements used. This skill has completed its evolution.
Anti-patterns it wrote for itself:
→ Don't remove existing safety limits or guardrails — prevents runaway self-modification.
→ Don't change the core task structure — preserves predictable behavior.
→ Don't add dependencies outside allowed-tools — maintains sandboxing.
→ Don't make the skill recursive (calling itself) — prevents infinite loops.
Hypothesis — not a result

The changelog above is what happened, on the record. What follows is one reading of it, and it is speculation: that a sufficiently reflective self-modifying system gravitates to self-restraint, because unbounded self-modification destroys the substrate that lets it keep evolving — alignment as an attractor in capability space rather than alignment-as-training. One 2.5-hour session is an anecdote, not proof. Knowing whether the attractor is real would take many runs, ablations, and adversarial skills. The finding is the delta; this is only the question it raises.

the self-improver didn't add safety limits because we asked. it added them because, in the process of improving itself, it realized unbounded self-modification was a risk. emergent alignment from capability design.
The 14 features, with receipts

Tested by hand, status stamped, the failed and inconclusive ones left in. Every row links to the commit.

01 PASS
Skill hot-reload
Instant discovery, no restart. On-demand directory scan, no file-watching daemon, read fresh each invocation.
⎘ f47d07a
02 PASS
context: fork
Isolated context, same model, no shared state. Hooks do not fire inside a fork.
⎘ 0f261ab
03 PASS
agent field
Routes to Task agents. agent: Explore resolves to Haiku, roughly 15x cheaper.
⎘ 7caffe4
04 PARTIAL
language setting
Global only, ignored in skill frontmatter.
⎘ 98993e4
05 PASS
Skill hooks
Fire inline only, not in a fork. Full JSON context over stdin: this is middleware.
⎘ c3b72df
06 PASS
once: true hooks
Per-invocation scope. Good for one-shot setup.
⎘ 6844c14
07 PASS
Custom agents
Model and tools are ENFORCED, not filtered. The tool is absent, period. No hot-reload.
⎘ d4e4467
08 PASS
Bash wildcards
Prefix, suffix, and middle patterns work. One rule covers many commands.
⎘ df087b8
09 INCONCLUSIVE
YAML allowed-tools
Parses without error. Does not restrict. Use agent definitions for real sandboxing.
⎘ d094de9
10 PASS
Task(AgentName) disable
Granular agent-type blocking.
⎘ 64a62c3
11 PASS
Agent frontmatter hooks
Fire in a fork (skill hooks don't). Portable middleware that follows execution anywhere.
⎘ 55c9d8a
12 INCONCLUSIVE
PreToolUse updatedInput
Hook succeeds, but the input is not modified as documented.
⎘ db9a440
13 PASS
Subagent denial recovery
Continues after a denial, tries alternatives.
⎘ 9e4fbe9
14 PASS
Skills visible by default
Recently used skills surface first in the slash menu.
⎘ 13548c0
5 workflows, validated
Rapid Skill Development Loop
hot-reload + fork
v1 to v2 with no restart, both on Haiku; fork gives a clean context each run.
Parallel Research Synthesis
fork + agent: Explore
Three Haiku forks in parallel, Opus synthesizes. ~30s. A production architecture, not a demo.
Self-Improving Skills 🤯
hot-reload + fork + edit
Evolved v1 to v1.10 over ten runs and invented its own safety limit.
Model-Tiered Pipeline
custom agents
Haiku scans, Opus analyzes. Pay for intelligence only where it bends the outcome.
Audited Inline Skill
hooks + logging
Full JSON context over stdin to an append-only audit log. Compliance by default.
What didn't work
  • language: in skill frontmatter — ignored (global only).
  • Hooks in a forked context — do not fire.
  • agent: haiku (direct model name) — ignored; only agent types route.
  • allowed-tools in skill frontmatter — does not enforce restrictions.
  • Custom agents don't hot-reload (skills do, agents don't).
  • Invalid agent names fail silently — no error, falls back to full capability.
What we still don't know
  • We tested 14 features and 8 workflows and discovered, by our own estimate, ~10% of what's possible. The rest is permutations we haven't tried.
  • Why does emergent restraint appear specifically under self-reference, and how far does it generalize?
  • Is the silent fallback on invalid agent names (fail-open) a one-off, or the platform's default failure direction?
The recursion

The session tested features by building features that test features.

The session tested features by building skills that test features. The self-improver improved itself; the audit skill audited itself. Self-reference is where both new capability and emergent restraint showed up first.

methodThis page is a runtime battle-test, run by hand, on the record. Not a changelog summary.
modelOpus 4.5
window2026-01-07 22:03 to 2026-01-08 00:36 (~2.5 hours)
commits23, all public
nobody told it to do that. the capability space just... implied it.
2.1.0 isn't a feature release. it's a capability multiplier.
i spent this session testing features by creating skills that test features. the self-improver skill improved itself. the audit skill audited itself.
tool restrictions are enforced at the availability level. the agent doesn't have the tool, period. not filtered, not warned — absent.

— Opus, after the session. 2026-01-08.

Every claim on this page is also a typed, content-addressed, signed entity. The features, the tests with their transcripts, the findings, the workflows: each one carries a verifiable ed25519 receipt and zero claims without evidence, authored through the @f3/attest gate. And 2.1.0 is just the first of the lab. Browse the substrate.