Combo Test: Skill-Driven Task Templates — runtime test

Hands-on runtime battle-test of Combo Test: Skill-Driven Task Templates. Result: INCONCLUSIVE.

Skills can define task graph templates via TaskCreate calls, but no special templating system is needed or present.

Setup

The test created a skill that instructed Claude to instantiate a three-task workflow: Setup (no dependencies), Work (blocked by Setup), Cleanup (blocked by Work). Each task was created using TaskCreate with explicit dependency chains set via TaskUpdate.
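The setup above can be modelled in a few lines of Python. This is a minimal sketch, not the real tool interface: `TaskStore`, `task_create`, `task_update`, and `ready` are hypothetical in-memory stand-ins, and the sequential integer IDs are an assumption made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    id: int
    title: str
    blocked_by: set = field(default_factory=set)
    done: bool = False


class TaskStore:
    """Hypothetical in-memory stand-in for the task list (not the real tool API)."""

    def __init__(self) -> None:
        self.tasks = {}
        self._next_id = 1  # assumption: IDs are handed out sequentially

    def task_create(self, title: str) -> int:
        task_id = self._next_id
        self._next_id += 1
        self.tasks[task_id] = Task(task_id, title)
        return task_id

    def task_update(self, task_id: int, *, add_blocked_by=(), done=None) -> None:
        task = self.tasks[task_id]
        task.blocked_by.update(add_blocked_by)
        if done is not None:
            task.done = done

    def ready(self) -> list:
        """Titles of unfinished tasks whose blockers are all done."""
        return [
            t.title
            for t in self.tasks.values()
            if not t.done and all(self.tasks[b].done for b in t.blocked_by)
        ]


def instantiate_workflow(store: TaskStore, arg: str) -> dict:
    # Create all three tasks first, then wire the dependency chain.
    ids = {
        name: store.task_create(f"{name}: {arg}")
        for name in ("Setup", "Work", "Cleanup")
    }
    store.task_update(ids["Work"], add_blocked_by=[ids["Setup"]])
    store.task_update(ids["Cleanup"], add_blocked_by=[ids["Work"]])
    return ids


store = TaskStore()
ids = instantiate_workflow(store, "demo")
print(store.ready())  # only the Setup task is unblocked
store.task_update(ids["Setup"], done=True)
print(store.ready())  # the Work task becomes ready
```

Creating every task before setting any dependency mirrors the order used in the test, where the IDs needed for the dependency chain are only known once each creation call has returned.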

What the Test Found

The skill executed as designed: Claude interpreted the workflow instructions and called TaskCreate/TaskUpdate to instantiate tasks. The test marked this "VALIDATED (by design)" because the primitive behavior worked: skills can emit arbitrary task structures, and task dependencies resolve correctly. However, the test also showed that this is not a "template system" in any formal sense. The template is just skill instructions; Claude parses arguments and creates tasks on each invocation. There is no server-side templating, no schema, and no idempotency guarantee.
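The missing idempotency guarantee is easy to see in a toy model. The sketch below assumes a hypothetical `task_create` helper that simply appends to a shared list, which is an illustration rather than the real tool's behavior. Under that assumption, invoking the same skill twice with the same argument yields a second, duplicate set of tasks.

```python
import itertools

_ids = itertools.count(1)  # assumption: sequential IDs within one session
tasks = {}


def task_create(title: str) -> int:
    """Hypothetical stand-in: always creates a new task, never deduplicates."""
    task_id = next(_ids)
    tasks[task_id] = title
    return task_id


def invoke_skill(arg: str) -> list:
    # Each invocation re-runs the same instructions from scratch.
    return [task_create(f"{step}: {arg}") for step in ("Setup", "Work", "Cleanup")]


first = invoke_skill("deploy")
second = invoke_skill("deploy")

titles = list(tasks.values())
duplicates = len(titles) - len(set(titles))
print(first, second, duplicates)
```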

Key Limitations

The test identified three practical gaps that make this pattern brittle:

  1. No true parameterization — Task ID bindings depend on Claude's argument parsing and session state, not schema validation.
  2. No hot-reload for instances — Refining skill instructions affects only new invocations; existing task graphs remain unchanged.
  3. Unpredictable task IDs — Task identity depends on when TaskCreate succeeds in the current session, making it difficult to reliably reference tasks across skill calls.
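The third gap can be illustrated with a small simulation. It is a sketch under stated assumptions: `Session`, `task_create`, and `find` are hypothetical names, and sequential per-session IDs are assumed purely for illustration. The same template lands on different IDs depending on what the session already contains, so a hard-coded reference such as "#2" can point at an unrelated task, while a lookup by title still resolves.

```python
class Session:
    """Hypothetical session-scoped task list; `existing` pre-populates earlier tasks."""

    def __init__(self, existing: int = 0) -> None:
        self.titles = {i: f"earlier task {i}" for i in range(1, existing + 1)}

    def task_create(self, title: str) -> int:
        task_id = len(self.titles) + 1  # assumption: next free sequential ID
        self.titles[task_id] = title
        return task_id

    def find(self, title: str) -> int:
        # Resolve a task by title instead of relying on a remembered ID.
        matches = [i for i, t in self.titles.items() if t == title]
        if len(matches) != 1:
            raise LookupError(
                f"expected exactly one task titled {title!r}, found {len(matches)}"
            )
        return matches[0]


def instantiate(session: Session, arg: str) -> list:
    return [session.task_create(f"{s}: {arg}") for s in ("Setup", "Work", "Cleanup")]


fresh = Session()
busy = Session(existing=4)

fresh_ids = instantiate(fresh, "build")
busy_ids = instantiate(busy, "build")

print(fresh_ids, busy_ids)         # same template, different IDs
print(busy.titles[2])              # ID 2 is an unrelated earlier task here
print(busy.find("Work: build"))    # title lookup still finds the Work task
```

Title lookup is only a workaround: it fails as soon as the same template is instantiated twice with the same argument, which is why `find` raises when it sees more than one match.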

Epistemic Status

This test is inconclusive as a runtime battle-test. It confirms that TaskCreate works from skill context, but it does not establish whether skill-driven templates are a viable abstraction for complex workflows. The pattern works for trivial cases but offers no guarantees for parameterization, reuse, or multi-session durability. The honest finding: this is a proof-of-concept, not a production primitive.

Primary source
2.1.16/tests/combo-04-skill-templates/TEST-RESULTS.md (verbatim from the corpus)

Combo Test: Skill-Driven Task Templates

Hypothesis

Skills can define task graph templates that get instantiated on invocation.

Test Approach

Created a test skill that defines a standard 3-task workflow template.

Results

Status: ✅ VALIDATED (by design)

Skills are just instructions that Claude follows. When a skill says "create these tasks", Claude creates them using TaskCreate. This is trivially true but confirms:

  1. Skills can define arbitrary task structures
  2. TaskCreate/TaskUpdate work from skill context
  3. Hot-reload means templates can be refined

Key Insight

The "template" is just skill instructions. No special templating system needed - Claude interprets the skill and creates tasks accordingly.

Pattern

# My Workflow Skill

When invoked, create these tasks:
1. "Setup: {arg}" (no deps)
2. "Work: {arg}" (blocked by #1)
3. "Cleanup: {arg}" (blocked by #2)

Use TaskCreate for each, then TaskUpdate to set dependencies.

Limitations

  • No true parameterization (depends on Claude parsing args)
  • Hot-reload only updates instructions, not existing tasks
  • Task IDs unpredictable (depends on session state)

Evidence & receipt
◇ ed25519 receipt
id: test_ac9129bd9085a97ec455b46d
alg: ed25519
pubkey: 9b87705613b1e2fd064d57fa75a6b679d2856ceafad6b1daa8f982493871b6dd
sig: e510628fc54f116a2392adff2abd7e89ab16d0783aa0c81ce1a489490207b40205e4fea95f4562dbd1894e6e55f6f86e9468b67ff7f63e8465c87655320a6205

Signed with an ed25519 key held off the repo. Anyone can verify against the published public key; nobody without the secret key can forge it. The signature proves integrity and authorship of this exact content, not a third-party timestamp or that the underlying claim is objectively true. signedAt is when the @f3/attest pipeline ran, not when the work happened; the evidence refs carry the source dates.
