Combo Test: Skill-Driven Task Templates — runtime test
Hands-on runtime battle-test of Combo Test: Skill-Driven Task Templates. Result: INCONCLUSIVE.
Skills can define task graph templates via TaskCreate calls, but no special templating system is needed or present.
Setup
The test created a skill that instructed Claude to instantiate a three-task workflow: Setup (no dependencies), Work (blocked by Setup), Cleanup (blocked by Work). Each task was created using TaskCreate with explicit dependency chains set via TaskUpdate.
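The dependency chain described above can be modeled in a few lines. This is a minimal in-memory sketch, not the real tool API: `TaskStore`, `task_create`, `task_update`, and the `Task` fields are hypothetical stand-ins for TaskCreate and TaskUpdate, whose actual parameters the test does not document.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    id: int
    title: str
    blocked_by: set[int] = field(default_factory=set)
    done: bool = False


class TaskStore:
    """Hypothetical in-memory stand-in for the session's task list."""

    def __init__(self) -> None:
        self.tasks: dict[int, Task] = {}
        self._next_id = 1

    def task_create(self, title: str) -> int:
        # Stand-in for TaskCreate: returns the new task's ID.
        task = Task(id=self._next_id, title=title)
        self.tasks[task.id] = task
        self._next_id += 1
        return task.id

    def task_update(self, task_id: int, blocked_by: list[int]) -> None:
        # Stand-in for TaskUpdate: records which tasks block this one.
        self.tasks[task_id].blocked_by.update(blocked_by)

    def ready(self) -> list[str]:
        # A task is ready when it is not done and every blocker is done.
        return [
            t.title
            for t in self.tasks.values()
            if not t.done and all(self.tasks[b].done for b in t.blocked_by)
        ]


store = TaskStore()
setup = store.task_create("Setup")
work = store.task_create("Work")
cleanup = store.task_create("Cleanup")
store.task_update(work, blocked_by=[setup])
store.task_update(cleanup, blocked_by=[work])

print(store.ready())            # only Setup is unblocked at first
store.tasks[setup].done = True
print(store.ready())            # Work becomes ready once Setup is done
```

Creating all three tasks first and wiring dependencies afterwards mirrors the two-step TaskCreate then TaskUpdate order the skill used.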
What the Test Found
The skill executed as designed: Claude interpreted the workflow instructions and called TaskCreate and TaskUpdate to instantiate the tasks. The test marked this "VALIDATED (by design)" because the primitive behavior worked: skills can emit arbitrary task structures, and task dependencies resolve correctly. However, the test also showed that this is not a "template system" in any formal sense. The template is just skill instructions; Claude parses the arguments and creates the tasks anew on each invocation. There is no server-side templating, no schema, and no idempotency guarantee.
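The missing idempotency guarantee can be illustrated with a toy model. The sketch below assumes each invocation simply appends three tasks; `invoke` and `invoke_idempotent` are hypothetical names, and the title-based check is only one possible way a guard could work, not something the tested setup provides.

```python
def invoke(task_list: list[str], arg: str) -> None:
    # Models one skill invocation: three creations, no lookup of existing tasks.
    for name in ("Setup", "Work", "Cleanup"):
        task_list.append(f"{name}: {arg}")


def invoke_idempotent(task_list: list[str], arg: str) -> None:
    # Hypothetical alternative: skip titles that already exist.
    for name in ("Setup", "Work", "Cleanup"):
        title = f"{name}: {arg}"
        if title not in task_list:
            task_list.append(title)


plain: list[str] = []
invoke(plain, "deploy")
invoke(plain, "deploy")
print(len(plain))    # 6

guarded: list[str] = []
invoke_idempotent(guarded, "deploy")
invoke_idempotent(guarded, "deploy")
print(len(guarded))  # 3
```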
Key Limitations
The test identified three practical gaps that make this pattern brittle:
- No true parameterization: argument binding depends on Claude's parsing of the invocation and on session state, not on schema validation.
- No hot-reload for instances: refining the skill instructions affects only new invocations; existing task graphs remain unchanged.
- Unpredictable task IDs: a task's ID depends on when TaskCreate succeeds in the current session, which makes it difficult to reference tasks reliably across skill calls.
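The task ID problem is easiest to see with a toy session. The sketch below assumes IDs come from a running per-session counter, which is an assumption made for illustration (the test only establishes that IDs depend on session state); `Session` and `instantiate` are hypothetical names.

```python
import itertools


class Session:
    """Hypothetical session whose task IDs come from a running counter."""

    def __init__(self, existing_tasks: int = 0) -> None:
        self._ids = itertools.count(existing_tasks + 1)
        self.blocked_by: dict[int, list[int]] = {}

    def task_create(self, title: str) -> int:
        task_id = next(self._ids)
        self.blocked_by[task_id] = []
        return task_id


def instantiate(session: Session, arg: str) -> dict[str, int]:
    # Wire dependencies with the IDs returned by this invocation,
    # never with literals such as "#1" copied from the skill text.
    ids = {name: session.task_create(f"{name}: {arg}")
           for name in ("Setup", "Work", "Cleanup")}
    session.blocked_by[ids["Work"]].append(ids["Setup"])
    session.blocked_by[ids["Cleanup"]].append(ids["Work"])
    return ids


fresh = instantiate(Session(), "deploy")
busy = instantiate(Session(existing_tasks=4), "deploy")
print(fresh)  # {'Setup': 1, 'Work': 2, 'Cleanup': 3}
print(busy)   # {'Setup': 5, 'Work': 6, 'Cleanup': 7}
```

The same instructions yield different IDs depending on what the session already contains, so a later skill call cannot assume that the Setup task is task 1.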
Epistemic Status
This test is inconclusive as a runtime battle-test. It confirms that TaskCreate works from skill context, but it does not establish whether skill-driven templates are a viable abstraction for complex workflows. The pattern works for trivial cases but offers no guarantees for parameterization, reuse, or multi-session durability. The honest finding: this is a proof-of-concept, not a production primitive.
Combo Test: Skill-Driven Task Templates
Hypothesis
Skills can define task graph templates that get instantiated on invocation.
Test Approach
Created a test skill that defines a standard 3-task workflow template.
Results
Status: ✅ VALIDATED (by design)
Skills are just instructions that Claude follows. When a skill says "create these tasks", Claude creates them using TaskCreate. This is trivially true but confirms:
- Skills can define arbitrary task structures
- TaskCreate/TaskUpdate work from skill context
- Hot-reload means templates can be refined
Key Insight
The "template" is just skill instructions. No special templating system is needed: Claude interprets the skill and creates the tasks accordingly.
Pattern
# My Workflow Skill
When invoked, create these tasks:
1. "Setup: {arg}" (no deps)
2. "Work: {arg}" (blocked by #1)
3. "Cleanup: {arg}" (blocked by #2)
Use TaskCreate for each, then TaskUpdate to set dependencies.
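To make the pattern concrete, the sketch below expands the three titles and their dependency edges deterministically. In the tested setup Claude performs this expansion by reading the skill text; the `TEMPLATE` structure and the `expand` function are hypothetical and exist only to show what the instructions amount to for a given `{arg}`.

```python
# Each entry: (title pattern, index of the template entry that blocks it, or None).
TEMPLATE = [
    ("Setup: {arg}", None),
    ("Work: {arg}", 0),
    ("Cleanup: {arg}", 1),
]


def expand(arg: str) -> list[dict]:
    """Expand the template into task specs for one invocation."""
    if not arg.strip():
        raise ValueError("arg must be a non-empty string")
    specs = []
    for pattern, blocker in TEMPLATE:
        blocked_by = []
        if blocker is not None:
            # Refer to the blocker by its expanded title, not by a task ID.
            blocked_by.append(TEMPLATE[blocker][0].format(arg=arg))
        specs.append({"title": pattern.format(arg=arg), "blocked_by": blocked_by})
    return specs


for spec in expand("release-1.2"):
    print(spec)
```

Rejecting an empty argument up front is the kind of check that the skill-based version leaves to Claude's interpretation of the invocation.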
Limitations
- No true parameterization (depends on Claude parsing args)
- Hot-reload only updates instructions, not existing tasks
- Task IDs unpredictable (depends on session state)