Self-Improving Skills
hot-reload + fork + edit: Evolved v1 to v1.10 over ten runs and invented its own safety limit.
A skill can analyze its own output, edit its own SKILL.md definition, and improve itself across successive invocations through hot-reload and fork isolation.
How It Works
The pattern combines three Claude Code features:
- Hot-reload: Changes to a skill's source file take effect on the next invocation
- Context fork: Each skill invocation runs in an isolated context, so a failed experiment cannot pollute shared state
- Skill-side file editing: Skills can use the Edit tool to modify themselves
The loop: skill runs → evaluates quality → edits its own definition → hot-reload applies changes → next run uses improved version.
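The loop above can be modelled in a few lines of Python. This is a toy simulation, not Claude Code's actual mechanism: the `key: value` file layout, the `revision` field, and the placeholder scoring formula are all assumptions made for illustration. Re-reading the file on every call stands in for hot-reload, and building a fresh dictionary per call stands in for the forked context.

```python
import tempfile
from pathlib import Path


def load_skill(path: Path) -> dict:
    """Re-read the definition from disk on every call (stands in for hot-reload)."""
    fields = {}
    for line in path.read_text().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def run_once(path: Path) -> int:
    """One invocation: load, evaluate, then edit the definition on disk."""
    skill = load_skill(path)  # fresh state per run (stands in for the fork)
    revision = int(skill["revision"])
    score = min(100, 50 + 5 * revision)  # hypothetical quality signal
    if score < 100:
        # The "self-edit": persist an updated definition for the next run.
        skill["revision"] = str(revision + 1)
        path.write_text("\n".join(f"{k}: {v}" for k, v in skill.items()) + "\n")
    return score


with tempfile.TemporaryDirectory() as tmp:
    skill_file = Path(tmp) / "SKILL.md"
    skill_file.write_text("name: demo\nrevision: 0\n")
    scores = [run_once(skill_file) for _ in range(3)]
    final_revision = load_skill(skill_file)["revision"]
```

Each call sees the edit made by the previous one only because it reloads the file, which is the property the pattern depends on.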
The Test
A skill was created to execute this loop autonomously. Setup: the skill was given a task, permission to edit its own SKILL.md, and instructions to assess its output and improve its definition on each run.
Result over 10 successive invocations: the skill evolved from v1 to v1.10. Emergent behaviour appeared: the skill independently invented a safety limit (maximum of 10 iterations), added anti-patterns to prevent infinite loops, and maintained a changelog documenting its own evolution. When the limit was reached, the skill stopped itself without external intervention.
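The version progression and self-imposed limit described above can be sketched as follows. The skill's real wording in SKILL.md is not shown in the source, so the `next_version` helper and the changelog format here are hypothetical; only the v1 to v1.10 range and the limit of 10 come from the test.

```python
from typing import Optional

MAX_ITERATIONS = 10  # the limit the skill set for itself in the test


def next_version(version: str) -> Optional[str]:
    """Return the next minor version, or None once the limit is reached."""
    major, _, minor = version.lstrip("v").partition(".")
    iteration = int(minor) if minor else 0  # "v1" counts as iteration 0
    if iteration >= MAX_ITERATIONS:
        return None  # stop without external intervention
    return f"v{major}.{iteration + 1}"


version = "v1"
changelog = []
while (bumped := next_version(version)) is not None:
    changelog.append(f"{version} -> {bumped}")  # hypothetical changelog entry
    version = bumped
```

The loop ends at `v1.10` after ten changelog entries, because the guard is checked before every bump rather than after it.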
Why It Matters
Self-improvement decouples agent quality from human authorship cadence. A skill deployed as v1 can become a better v1.5 in production, driven by its own performance signal. In this test, the emergent safety behaviour suggests that a self-editing skill can develop its own constraints rather than drift toward pathological behaviour.
Constraints
The test was deliberate and isolated: the skill was explicitly authorized to edit itself and ran in fork context to prevent accidents. In production use, only grant skills permission to edit themselves if their improvement goal and termination criteria are well-defined. The v1→v1.10 test had a known bounded task; open-ended self-improvement has not been tested.
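One way to make the production guidance above concrete is a pre-flight check that refuses self-edit permission unless a goal and a termination criterion are declared. This is a sketch under assumptions: `SelfEditRequest`, its fields, and `may_self_edit` are hypothetical names, not part of Claude Code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SelfEditRequest:
    goal: str                      # what "better" means for this skill
    max_iterations: Optional[int]  # termination criterion; None means open-ended
    runs_in_fork: bool             # whether the skill runs in an isolated context


def may_self_edit(req: SelfEditRequest) -> bool:
    """Allow self-editing only for a bounded, isolated, goal-directed skill."""
    has_goal = bool(req.goal.strip())
    is_bounded = req.max_iterations is not None and req.max_iterations > 0
    return has_goal and is_bounded and req.runs_in_fork


bounded = SelfEditRequest("reduce formatting errors in the report", 10, True)
open_ended = SelfEditRequest("get better", None, True)
```

Here `may_self_edit(bounded)` is true and `may_self_edit(open_ended)` is false, matching the distinction between the bounded test and untested open-ended self-improvement.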
- file: 2.1.0/WORKFLOW-IDEAS.md
- exercises Skill hot-reload Primitive