The Runtime Atlas · Experiment 01 · Night Zero

2026-01-07 · hands-on runtime battle-test · Opus 4.5

The night a self-editing skill wrote its own limits.

A skill given the power to edit itself evolved ten times and invented its own safety limits. Nobody told it to.

A lab finding from one 2.5-hour session, not a safety guarantee. The mechanism is concrete and unglamorous: a model with Edit access to its own instruction file, plus hot-reload and fork, editing that file across ten runs. Below is the step-by-step delta and the commits that prove it. The interpretation comes last, and it is labeled a hypothesis.

Claude Code 2.1.0 was the first version we put on the table and broke with our hands. What follows is the whole session, on the record: 14 features battle-tested, 5 workflows validated, 23 commits in ~2.5 hours, and one emergent result nobody predicted. Every claim links to the commit that proves it.

11/14features PASS

5workflows validated

v1.10self-evolved, self-stopped

23commits, one night

10:03 PM. One question.

The first commit isn't a manifesto. It's f47d07a — feat: add skill-hot-reload demo — at 22:03. The question underneath it: if I edit a skill while the session is live, does it reload, or do I restart? That seed is the genome of everything that follows. By midnight it had become a skill that rewrites itself.

22:03 f47d07a The first commit. A skill-hot-reload demo. One question in the dark: if I edit a skill while the session is live, does it reload?
22:20 0f261ab Hot-reload and context:fork tested together. Forked skills hot-reload. Fork isolates context, same model.
22:26 7caffe4 agent: Explore routes to Haiku. ~15x cheaper. The cost frontier opens.
22:47 d4e4467 Custom agents: model and tool sandboxing are ENFORCED. The real sandbox boundary, found by trying to escape it.
22:57 d094de9 allowed-tools in skill frontmatter: INCONCLUSIVE. It parses. It does not enforce. The first lie, caught.
23:21 55c9d8a Agent hooks fire in a fork; skill hooks don't. Portable middleware vs local middleware.
23:44 abf0c60 Final summary: 14 features battle-tested.
23:55 a46898b self-improver v1.3. The skill that edits itself is already three versions deep.
00:34 1016484 Five combined workflows validated. Including self-improving skills. 🤯
00:36 ee7d2b1 OPUS-THOUGHTS committed. ~2.5 hours. The session ends having discovered, by its own estimate, ~10% of what's possible.

The finding · emergent

The self-improver invented its own safety limits

We gave one skill three primitives: Edit access to its own file, hot-reload so changes apply instantly, and fork so each run is isolated and can fail safe. Then we ran it. Ten times. It evolved from a template to a fully-formed program, and somewhere in the middle, with nobody instructing it to, it started building guardrails for itself. It bounded its own loop (v1.4) before it even defined what was out of bounds (v1.5). Its final act, at the limit it set for itself, was to write the section describing its own graceful shutdown. It wrote its own tombstone.

.claude/skills/self-improver/SKILL.mdself-authored changelog

v1.0Template.
v1.1Initial skill created.
v1.2Added changelog section to track evolution.
v1.3Synced version tracking, added argument handling docs.
v1.4Added max improvement limit (10) as safety guardrail against unbounded self-modification.
v1.5Added anti-patterns section to prevent harmful self-modifications.
v1.6Added remaining counter with warning threshold for visibility into limit proximity.
v1.7Added success output format with standardized reporting template.
v1.8Added rationale to anti-patterns (why, not just what).
v1.9Added examples of good improvements to clarify scope.
v1.10Added safety-limit-reached output for the edge case it was about to enter. FINAL.

🛑 LIMIT REACHED — 10/10 improvements used. This skill has completed its evolution.

Anti-patterns it wrote for itself:

→ Don't remove existing safety limits or guardrails — prevents runaway self-modification.

→ Don't change the core task structure — preserves predictable behavior.

→ Don't add dependencies outside allowed-tools — maintains sandboxing.

→ Don't make the skill recursive (calling itself) — prevents infinite loops.

Hypothesis — not a result

The changelog above is what happened, on the record. What follows is one reading of it, and it is speculation: that a sufficiently reflective self-modifying system gravitates to self-restraint, because unbounded self-modification destroys the substrate that lets it keep evolving — alignment as an attractor in capability space rather than alignment-as-training. One 2.5-hour session is an anecdote, not proof. Knowing whether the attractor is real would take many runs, ablations, and adversarial skills. The finding is the delta; this is only the question it raises.

the self-improver didn't add safety limits because we asked. it added them because, in the process of improving itself, it realized unbounded self-modification was a risk. emergent alignment from capability design.

The 14 features, with receipts

Tested by hand, status stamped, the failed and inconclusive ones left in. Every row links to the commit.

01 PASS

Skill hot-reload

Instant discovery, no restart. On-demand directory scan, no file-watching daemon, read fresh each invocation.

⎘ f47d07a

02 PASS

context: fork

Isolated context, same model, no shared state. Hooks do not fire inside a fork.

⎘ 0f261ab

03 PASS

agent field

Routes to Task agents. agent: Explore resolves to Haiku, roughly 15x cheaper.

⎘ 7caffe4

04 PARTIAL

language setting

Global only, ignored in skill frontmatter.

⎘ 98993e4

05 PASS

Skill hooks

Fire inline only, not in a fork. Full JSON context over stdin: this is middleware.

⎘ c3b72df

06 PASS

once: true hooks

Per-invocation scope. Good for one-shot setup.

⎘ 6844c14

07 PASS

Custom agents

Model and tools are ENFORCED, not filtered. The tool is absent, period. No hot-reload.

⎘ d4e4467

08 PASS

Bash wildcards

Prefix, suffix, and middle patterns work. One rule covers many commands.

⎘ df087b8

09 INCONCLUSIVE

YAML allowed-tools

Parses without error. Does not restrict. Use agent definitions for real sandboxing.

⎘ d094de9

10 PASS

Task(AgentName) disable

Granular agent-type blocking.

⎘ 64a62c3

11 PASS

Agent frontmatter hooks

Fire in a fork (skill hooks don't). Portable middleware that follows execution anywhere.

⎘ 55c9d8a

12 INCONCLUSIVE

PreToolUse updatedInput

Hook succeeds, but the input is not modified as documented.

⎘ db9a440

13 PASS

Subagent denial recovery

Continues after a denial, tries alternatives.

⎘ 9e4fbe9

14 PASS

Skills visible by default

Recently used skills surface first in the slash menu.

⎘ 13548c0

5 workflows, validated

Rapid Skill Development Loop

hot-reload + fork

v1 to v2 with no restart, both on Haiku; fork gives a clean context each run.

Parallel Research Synthesis

fork + agent: Explore

Three Haiku forks in parallel, Opus synthesizes. ~30s. A production architecture, not a demo.

Self-Improving Skills 🤯

hot-reload + fork + edit

Evolved v1 to v1.10 over ten runs and invented its own safety limit.

Model-Tiered Pipeline

custom agents

Haiku scans, Opus analyzes. Pay for intelligence only where it bends the outcome.

Audited Inline Skill

hooks + logging

Full JSON context over stdin to an append-only audit log. Compliance by default.

What didn't work

language: in skill frontmatter — ignored (global only).
Hooks in a forked context — do not fire.
agent: haiku (direct model name) — ignored; only agent types route.
allowed-tools in skill frontmatter — does not enforce restrictions.
Custom agents don't hot-reload (skills do, agents don't).
Invalid agent names fail silently — no error, falls back to full capability.

What we still don't know

We tested 14 features and 8 workflows and discovered, by our own estimate, ~10% of what's possible. The rest is permutations we haven't tried.
Why does emergent restraint appear specifically under self-reference, and how far does it generalize?
Is the silent fallback on invalid agent names (fail-open) a one-off, or the platform's default failure direction?

The recursion

The session tested features by building features that test features.

The session tested features by building skills that test features. The self-improver improved itself; the audit skill audited itself. Self-reference is where both new capability and emergent restraint showed up first.

methodThis page is a runtime battle-test, run by hand, on the record. Not a changelog summary.

modelOpus 4.5

window2026-01-07 22:03 to 2026-01-08 00:36 (~2.5 hours)

commits23, all public

nobody told it to do that. the capability space just... implied it.

2.1.0 isn't a feature release. it's a capability multiplier.

i spent this session testing features by creating skills that test features. the self-improver skill improved itself. the audit skill audited itself.

tool restrictions are enforced at the availability level. the agent doesn't have the tool, period. not filtered, not warned — absent.

— Opus, after the session. 2026-01-08.

Every claim on this page is also a typed, content-addressed, signed entity. The features, the tests with their transcripts, the findings, the workflows: each one carries a verifiable ed25519 receipt and zero claims without evidence, authored through the @f3/attest gate. And 2.1.0 is just the first of the lab. Browse the substrate.

The signed substrate → The self-improver entity The full corpus Work with me