Ode to the Haiku Horde
One model put on its hardass hat and graded the output of thirty cheaper agents running a Ralph loop. The verdict: B+. Here is the full audit.
Auditor: Sonnet 4.5 (quality hardass mode)
Date: 2026-01-19
Target: clauffect autonomous implementation by ~30 Haiku agents
Claimed Progress: 49 commits (actually 77 in last 24h), +7.4k LOC, 3 modules complete
Executive Summary
Overall Grade: B+ (83/100)
real talk: this is way better than expected for autonomous haiku agents. the code is genuinely idiomatic effect-ts, not theater. 2044 passing tests, proper context tags, schema-based errors, clean layer composition (minus one type bug). the architecture is sound, separation of concerns is respected, and there’s actual property-based testing with fast-check.
Key Wins:
- proper effect idioms throughout (Effect.gen, Context.Tag, Schema.TaggedError)
- 140 TypeScript files, 44 tagged errors, zero runPromise misuse in src/
- comprehensive test coverage with @effect/vitest
- clean service boundaries and layer composition
- empty TYPE_DEBT.md (haikus didn’t leave garbage for later)
Key Concerns:
- build broken: auth layer integration left a type error in bin.ts (HttpClient/SessionResume requirements never provided)
- session/mcp/permissions claimed “complete” but auth at 46%, conversation at 71%
- one blocking issue prevents binary from running
Genuine Progress vs Theater: 85% genuine. the code works, tests pass, patterns are correct. it’s not shipped yet but it’s legit foundation.
Module Reviews
Session (claimed 100%)
Spec Adherence: 9/10
Effect Idiomacy: 9/10
Test Quality: 8/10
Actually Complete: Yes
Notable Good:
- Storage.ts: proper config injection, Effect.gen throughout, handles persistSession flag correctly
- Manager.ts: clean service definition with Context.Tag pattern
- Checkpoint.ts: file snapshots with proper error handling (CheckpointError with discriminated reasons)
- Tests use proper Effect.provide chains, isolated with unique session IDs
Notable Bad:
- Storage.test.ts still uses plain vitest instead of @effect/vitest
- some as any casts for branded types (SessionId) - acceptable for now but should be resolved
- relies on node:fs in Checkpoint.ts alongside @effect/platform (mixing abstraction levels)
Verdict: Actually complete. Session creation, storage, checkpointing, and rewind work. Tests prove it.
MCP (claimed 100%)
Spec Adherence: 9/10
Effect Idiomacy: 10/10
Test Quality: 9/10
Actually Complete: Yes
Notable Good:
- Client.ts: beautiful JSON-RPC over NDJSON using Mailbox + Deferred for request correlation
- proper error handling with McpClientError (Schema.TaggedError)
- timeout handling using Effect.timeoutFail with correct defaults (30s connection, ~27h tool timeout)
- output truncation logic with canonical token counting (1600 per image, char count for text)
- tool naming convention: mcp__${server}__${tool} with normalization
- supports stdio transport with Command + Stream
- concurrent server initialization with concurrency: 4
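to make the naming convention above concrete, here is a minimal sketch in plain TypeScript. the audit only says names are built as mcp__${server}__${tool} "with normalization"; the actual rule in Client.ts isn't quoted, so the character class below and the names normalizeSegment and mcpToolName are assumptions for illustration.

```typescript
// Hypothetical normalization rule: any character outside [A-Za-z0-9_-]
// becomes "_". The real rule used by the codebase is not shown in the audit.
const normalizeSegment = (segment: string): string =>
  segment.replace(/[^A-Za-z0-9_-]/g, "_");

// Builds a qualified tool name in the mcp__${server}__${tool} shape.
const mcpToolName = (server: string, tool: string): string =>
  `mcp__${normalizeSegment(server)}__${normalizeSegment(tool)}`;
```

under that assumed rule, a server called "my server" exposing "read.file" would surface as mcp__my_server__read_file, which keeps the name safe to use as a single identifier in permission patterns.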
Notable Bad:
- Manager.ts is thin (just config loading) - actual work is in Client.ts
- no WebSocket transport yet (documented as deferred in PLAN.md)
- Config.ts has one try/catch for JSON.parse (could use Schema.parseJson)
Verdict: Actually complete for stdio transport. Tool discovery, execution, resource reading all work. Tests verify timeout behavior and error handling.
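the token counting mentioned in the list above (1600 per image, char count for text) can be sketched roughly like this. it assumes a text block costs its character count and an image costs a flat 1600; the ContentBlock shape, the truncateToBudget helper, and the idea of cutting the last text block to fit are illustrative guesses, not the code in Client.ts.

```typescript
// Hypothetical content shape; the real wire types are not shown in the audit.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image" };

const IMAGE_TOKEN_COST = 1600; // figure quoted in the audit

// Text is counted by character length, images at a flat cost.
const estimateTokens = (blocks: ReadonlyArray<ContentBlock>): number => {
  let total = 0;
  for (const block of blocks) {
    total += block.type === "image" ? IMAGE_TOKEN_COST : block.text.length;
  }
  return total;
};

// Keeps blocks in order until the budget runs out; a text block that does not
// fit is cut to the remaining budget, an image that does not fit ends the output.
const truncateToBudget = (
  blocks: ReadonlyArray<ContentBlock>,
  maxTokens: number
): ContentBlock[] => {
  const kept: ContentBlock[] = [];
  let remaining = maxTokens;
  for (const block of blocks) {
    if (remaining <= 0) break;
    if (block.type === "image") {
      if (IMAGE_TOKEN_COST > remaining) break;
      kept.push(block);
      remaining -= IMAGE_TOKEN_COST;
    } else if (block.text.length <= remaining) {
      kept.push(block);
      remaining -= block.text.length;
    } else {
      kept.push({ type: "text", text: block.text.slice(0, remaining) });
      remaining = 0;
    }
  }
  return kept;
};
```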
Permissions (claimed 100%)
Spec Adherence: 8/10
Effect Idiomacy: 10/10
Test Quality: 9/10
Actually Complete: Yes (for decider, prompter is stub)
Notable Good:
- Decider.ts: pure logic, no I/O, proper separation of concerns
- pattern matching with wildcard support
- mode-based rules (plan denies writes, acceptEdits allows edits, bypassPermissions allows all)
- PermissionDeciderService interface is clean
- 45 tests in Context.test.ts, 39 in Rules.test.ts, 23 in Integration.test.ts
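a pure decider along the lines described above might look like the sketch below. the three mode names come from the audit; everything else (the "default" mode, the read/write tool kind, the Rule shape, the precedence of mode checks over pattern rules, and falling back to "ask") is an assumption made for the example rather than what Decider.ts actually does.

```typescript
type Mode = "default" | "plan" | "acceptEdits" | "bypassPermissions";
type Decision = "allow" | "deny" | "ask";

// Hypothetical shapes for illustration only.
interface ToolCall { name: string; kind: "read" | "write" }
interface Rule { pattern: string; decision: Decision }

const escapeRegExp = (s: string): string =>
  s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// "*" matches any run of characters; every other character is literal.
const matchesPattern = (pattern: string, toolName: string): boolean =>
  new RegExp("^" + pattern.split("*").map(escapeRegExp).join(".*") + "$").test(toolName);

// Pure function: no I/O, same inputs always give the same decision.
const decide = (mode: Mode, tool: ToolCall, rules: ReadonlyArray<Rule>): Decision => {
  if (mode === "bypassPermissions") return "allow";
  if (mode === "plan" && tool.kind === "write") return "deny";
  if (mode === "acceptEdits" && tool.kind === "write") return "allow";
  for (const rule of rules) {
    if (matchesPattern(rule.pattern, tool.name)) return rule.decision;
  }
  return "ask"; // nothing matched: hand off to the prompter
};
```

keeping the decider free of I/O is what makes the large test counts cheap: every case is a plain function call with no layers to provide.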
Notable Bad:
- PermissionPrompter is mostly stubs (AutoAllowPrompterLive just allows everything)
- interactive prompting not implemented yet
- no actual user interaction for “ask” decisions
Verdict: Decision logic is complete and correct. Prompting side is placeholder. Good enough for now since most tools run with bypass or auto-allow in autonomous mode.
Conversation (claimed 71%)
Spec Adherence: 7/10
Effect Idiomacy: 9/10
Test Quality: 8/10
Actually Complete: Partially
Files Checked:
- ConversationRunner.ts (580 lines, main loop)
- MessageParser.ts (152 lines, content normalization)
- ContextBuilder.ts (192 lines, build request)
- Streamer.ts (streaming + tool extraction)
Notable Good:
- proper state machine for turn management
- budget tracking (maxTurns, maxBudgetUsd, token accumulation)
- permission denial tracking
- tool execution pipeline with hooks
- compaction logic with thresholds
- 25 tests for MessageParser with property-based testing
- 22 tests for ContextBuilder
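for the budget tracking item in the list above, here is a rough sketch of how per-turn accumulation and limit checks could fit together. maxTurns and maxBudgetUsd are the names from the audit; the Usage and TurnResult shapes, the stop reasons, and the "stop once the limit is reached" comparison are assumptions, not a copy of ConversationRunner.ts.

```typescript
interface Budget { maxTurns?: number; maxBudgetUsd?: number }

// Hypothetical accumulator and per-turn result shapes.
interface Usage { turns: number; costUsd: number; inputTokens: number; outputTokens: number }
interface TurnResult { costUsd: number; inputTokens: number; outputTokens: number }

const emptyUsage: Usage = { turns: 0, costUsd: 0, inputTokens: 0, outputTokens: 0 };

// Returns a new Usage value; the previous one is left untouched.
const recordTurn = (usage: Usage, turn: TurnResult): Usage => ({
  turns: usage.turns + 1,
  costUsd: usage.costUsd + turn.costUsd,
  inputTokens: usage.inputTokens + turn.inputTokens,
  outputTokens: usage.outputTokens + turn.outputTokens,
});

type StopReason = "max_turns" | "max_budget" | null;

// Checked before starting another turn; null means the loop may continue.
const checkBudget = (budget: Budget, usage: Usage): StopReason => {
  if (budget.maxTurns !== undefined && usage.turns >= budget.maxTurns) return "max_turns";
  if (budget.maxBudgetUsd !== undefined && usage.costUsd >= budget.maxBudgetUsd) return "max_budget";
  return null;
};
```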
Notable Bad:
- ConversationRunner.ts has 2 as any casts for wire message types
- some complex nested Effect.gen blocks (readability concern)
- fallback model logic partially implemented
Verdict: 71% is accurate. Core loop works, streaming works, tool execution wired. Missing pieces are model fallback edge cases and some compaction scenarios.
Auth (claimed 46%)
Spec Adherence: 6/10
Effect Idiomacy: 9/10
Test Quality: 8/10
Actually Complete: Partially (broke the build)
Files Checked:
- Flow.ts (orchestrates detector + validator + oauth)
- Detector.ts (finds API keys from env/config)
- Validator.ts (validates key format)
- Errors.ts (discriminated error types)
Notable Good:
- proper error hierarchy (NoKeyAvailableError, ApiKeyInvalidError, OAuthFailedError)
- all extend Schema.TaggedError with discriminated code field
- detector has priority order: env var > settings > config files
- 22 tests for error handling, 12 for validator
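the detector's priority order (env var > settings > config files) boils down to a first-match scan. a minimal sketch, assuming each source has already been read into an optional string; the Candidates and Detected shapes are hypothetical, and the real Detector.ts presumably fails with NoKeyAvailableError where this returns null.

```typescript
type KeySource = "env" | "settings" | "configFile";

// Hypothetical input: whatever each source yielded, if anything.
interface Candidates { env?: string; settings?: string; configFile?: string }
interface Detected { source: KeySource; apiKey: string }

// Highest priority first, as listed in the audit.
const PRIORITY: ReadonlyArray<KeySource> = ["env", "settings", "configFile"];

// Returns the first non-blank key in priority order, or null when none exists.
const detectApiKey = (candidates: Candidates): Detected | null => {
  for (const source of PRIORITY) {
    const value = candidates[source];
    if (value !== undefined && value.trim() !== "") {
      return { source, apiKey: value };
    }
  }
  return null;
};
```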
Notable Bad:
- BROKEN BUILD: bin.ts tries to use AuthFlowLive but doesn’t provide HttpClient/SessionResume requirements
- OAuth flow not fully wired (placeholder in some paths)
- status reporter interface defined but not fully implemented
Verdict: 46% seems right. The pieces exist but integration is incomplete. Type error is fixable (just need to adjust layer composition) but it’s blocking.
Effect Idiom Compliance
Violations Found: 0 critical, 3 minor
Good Patterns Observed:
- Context.Tag usage: consistent across all services
- Effect.gen: used correctly, no async/await mixing
- Schema.TaggedError: 44 error types, all properly structured
- Layer.effect / Layer.succeed: correct layer construction
- Effect.provide chains: proper dependency injection
- Stream usage: NDJSON parsing, tool output, wire messages
- No runPromise in src/ (only in tests, which is correct)
- No catchAll swallowing errors silently
Minor Issues:
- as any usage: 19 occurrences (mostly for wire protocol type casts, acceptable)
- try/catch blocks: 20 occurrences (mostly in I/O boundary layers like Config.ts, Session/Checkpoint.ts)
- node:fs mixing: Checkpoint.ts uses native fs instead of @effect/platform FileSystem (pragmatic but breaks abstraction)
Canonical Pattern Adherence:
checked against ~/git_forks/effect - service definitions, error handling, and layer composition match official Effect patterns.
Architectural Concerns
Layer Dependency Graph
checked src/Layers.ts - clean separation:
SdkEngine
└─ ConversationRunner
├─ Streamer → ApiClient
├─ ToolExecutor → PermissionDecider + PermissionPrompter + HookExecutor
├─ Compaction
└─ ApiClient
no cycles detected in the dependency graph. the one build error is a missing layer provision, not a structural issue.
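the "no cycles" claim is easy to re-check mechanically. the sketch below transcribes the tree above into an adjacency list and runs a depth-first search that reports the first cycle it finds. the edges are copied from the diagram; the Graph type and findCycle are illustrative, not the tooling the audit used.

```typescript
type Graph = Record<string, ReadonlyArray<string>>;

// Edges transcribed from the dependency tree in this section.
const layerGraph: Graph = {
  SdkEngine: ["ConversationRunner"],
  ConversationRunner: ["Streamer", "ToolExecutor", "Compaction", "ApiClient"],
  Streamer: ["ApiClient"],
  ToolExecutor: ["PermissionDecider", "PermissionPrompter", "HookExecutor"],
};

// Depth-first search with a "visiting" mark: reaching a node that is still
// being visited means there is a back edge, i.e. a cycle.
const findCycle = (graph: Graph): string[] | null => {
  const state = new Map<string, "visiting" | "done">();
  const path: string[] = [];

  const visit = (node: string): string[] | null => {
    const seen = state.get(node);
    if (seen === "done") return null;
    if (seen === "visiting") return path.slice(path.indexOf(node)).concat(node);
    state.set(node, "visiting");
    path.push(node);
    for (const dep of graph[node] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    state.set(node, "done");
    return null;
  };

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
};
```

ApiClient appears twice in the tree (under ConversationRunner and under Streamer), which makes the graph a DAG with a shared node, not a cycle.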
Wire Protocol
Schema/Wire.ts defines a unified WireMessage union with proper discriminated types. protocol layer uses Stream for message passing. clean.
Test Architecture
- 104 test files, 2044 passing tests, 9 skipped, 6 todo
- uses @effect/vitest for most tests (96+ files)
- property-based testing with fast-check in MessageParser, Tool/Schema
- test isolation: beforeEach/afterEach for env vars, unique session IDs
- golden tests for harness validation (113 passing in Harness/Golden.test.ts)
test quality is high. not just unit tests, actual property-based and integration tests.
Code Smells & Anti-Patterns
Found: 2 minor smells
- Type Casts (as any): 19 occurrences
  - src/Agent/ConversationRunner.ts:2 - wire message type gymnastics
  - src/Protocol/Stdio.ts:6 - serialization boundary
  - most are at boundaries where Effect’s strict types clash with JSON serialization
  - Verdict: acceptable, confined to wire protocol layer
- Mixed Abstraction Levels: Checkpoint.ts uses native node:fs alongside @effect/platform
  - Why: pragmatic for recursive directory walking
  - Impact: breaks testability slightly, but checkpoint tests pass
  - Verdict: technical debt but not blocking
Not Found:
- runPromise misuse ✓
- catchAll swallowing errors ✓
- async/await mixing with Effect.gen ✓
- circular dependencies ✓
- empty catch blocks ✓
Recommendations
What to Fix Before Shipping
- CRITICAL: Fix bin.ts layer composition
  - Add HttpClient.layer to MainLayer
  - Resolve SessionResume dependency (either provide or remove from AuthFlow requirements)
  - Should take 5 minutes
- Auth Flow Completion (46% → 80%)
  - Wire OAuth flow end-to-end
  - Implement AuthStatusReporter properly
  - Test auth failures and retries
- Conversation Runner Polish (71% → 85%)
  - Implement model fallback on rate limit
  - Test compaction edge cases
  - Remove as any casts with proper branded type helpers
- Type Debt Resolution
  - Replace as any with proper type guards
  - Consider Schema.parseJson instead of JSON.parse + try/catch
  - Fix SessionId branding to avoid casts
What’s Actually Ready
- Session module: ship it
- MCP module: ship stdio transport, defer websocket
- Permission decider: ship it, document prompter as auto-allow
- Tool executor: ship it (used in 2044 passing tests)
- Wire protocol: ship it
- Test infrastructure: ship it (great foundation)
Process Improvements for Next Ralph Session
- Break build on type errors: use tsc --noEmit in pre-commit hook
- Require layer composition tests: catch missing dependencies early
- Document incomplete integrations: auth flow was marked 46% but still merged into bin.ts
- Use TYPE_DEBT.md: haikus left it empty but bin.ts has a known issue - should be logged
Acknowledgments & Specific Wins
this is the good part. where the haikus actually crushed it:
MCP Client Implementation
whoever wrote src/Mcp/Client.ts (855 lines): absolute clinic on Effect + JSON-RPC. proper use of Mailbox for stdin, Deferred for request correlation, Stream for stdout. timeout handling is canonical-correct. error types are discriminated. scope management is clean. this is production-grade code.
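for readers who haven't seen request correlation before, here is the idea stripped down to plain TypeScript, with a Promise standing in for Effect's Deferred. this is not the code in Client.ts: RequestCorrelator and its methods are made up for the example, and timeouts, Mailbox, Stream, and scope handling are all left out.

```typescript
interface JsonRpcRequest { jsonrpc: "2.0"; id: number; method: string; params?: unknown }
interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

interface Waiter { resolve: (value: unknown) => void; reject: (error: Error) => void }

class RequestCorrelator {
  private nextId = 1;
  private readonly pending = new Map<number, Waiter>();

  // Returns the NDJSON line to write to the server's stdin, plus a promise
  // that settles when a response carrying the same id comes back.
  send(method: string, params?: unknown): { line: string; reply: Promise<unknown> } {
    const id = this.nextId++;
    const request: JsonRpcRequest = { jsonrpc: "2.0", id, method, params };
    const reply = new Promise<unknown>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    return { line: JSON.stringify(request) + "\n", reply };
  }

  // Feed one line read from stdout. Returns false when nobody is waiting on
  // that id (unknown or already answered), so the caller can log and move on.
  receive(line: string): boolean {
    const response = JSON.parse(line) as JsonRpcResponse;
    const waiter = this.pending.get(response.id);
    if (!waiter) return false;
    this.pending.delete(response.id);
    if (response.error) waiter.reject(new Error(response.error.message));
    else waiter.resolve(response.result);
    return true;
  }

  pendingCount(): number {
    return this.pending.size;
  }
}
```

the point of the id-keyed map is that responses can arrive in any order and still reach the right caller, which is what lets several tool calls share one stdio pipe.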
Test Coverage Breadth
2044 passing tests is not theater. that’s:
- 104 test files
- property-based tests with fast-check
- @effect/vitest integration
- golden tests for protocol validation
- integration tests for tool execution
- isolated env var tests
this is the kind of coverage that prevents regressions. someone actually gave a shit.
Session Checkpointing
src/Session/Checkpoint.ts implements file snapshots with:
- checkpoint before every write tool
- restore with dry_run support
- tracks created/modified/deleted files
- proper error handling (CheckpointError with reasons)
this is a hard problem and they nailed it.
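to show what tracking created/modified/deleted files and a dry-run restore involve, here is a small sketch over in-memory snapshots. it assumes a snapshot is a map from path to content (or a content hash); the Snapshot, FileChanges, and RestoreAction names and the apply callback are hypothetical and say nothing about how Checkpoint.ts stores things on disk.

```typescript
// Hypothetical snapshot: file path -> content (or a hash of it).
type Snapshot = ReadonlyMap<string, string>;

interface FileChanges { created: string[]; modified: string[]; deleted: string[] }

// Compares the checkpoint taken before a write tool ran with the current state.
const diffSnapshots = (before: Snapshot, after: Snapshot): FileChanges => {
  const changes: FileChanges = { created: [], modified: [], deleted: [] };
  after.forEach((content, path) => {
    const previous = before.get(path);
    if (previous === undefined) changes.created.push(path);
    else if (previous !== content) changes.modified.push(path);
  });
  before.forEach((_content, path) => {
    if (!after.has(path)) changes.deleted.push(path);
  });
  return changes;
};

interface RestoreAction { path: string; action: "delete" | "rewrite" | "recreate" }

// Undoing a change set: created files get deleted, modified files rewritten,
// deleted files recreated. With dryRun the plan is returned but never applied.
const restore = (
  changes: FileChanges,
  dryRun: boolean,
  apply: (action: RestoreAction) => void
): RestoreAction[] => {
  const actions: RestoreAction[] = [
    ...changes.created.map((path) => ({ path, action: "delete" as const })),
    ...changes.modified.map((path) => ({ path, action: "rewrite" as const })),
    ...changes.deleted.map((path) => ({ path, action: "recreate" as const })),
  ];
  if (!dryRun) actions.forEach(apply);
  return actions;
};
```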
Empty TYPE_DEBT.md
haikus cleaned up after themselves. no “TODO: fix this later” garbage. every type issue they hit, they either resolved or documented in git history.
Consistent Patterns
every service follows the same structure:
- Error types (Schema.TaggedError)
- Service interface
- Context.Tag
- Implementation
- Layer (Live + Stub)
this consistency makes the codebase navigable. you can grep for patterns and find what you need.
Closing Thoughts
The Goal Was: autonomous haiku agents implement 3 modules (session, mcp, permissions) + 71% of conversation + 46% of auth.
What They Delivered: exactly that, plus 2044 passing tests, proper effect-ts idioms, and a build that’s one layer fix away from working.
Was It Worth $10.42/hour? (assuming 30 agents × 24 hours at haiku pricing): absolutely. this is legit foundation work. not perfect, but way better than “senior engineer speed-running without tests.”
The One Blocker: bin.ts type error is fixable in 5 minutes. just add HttpClient.layer and sort out SessionResume.
Would I Ship It? not yet. but i’d merge it to a feature branch and fix the auth integration. the core is solid.
Grade Breakdown:
- Code quality: A- (proper patterns, clean separation)
- Test coverage: A (2044 tests, property-based, integration)
- Completeness: B (3/6 modules done, 2 partial, 1 broken)
- Architecture: A (no cycles, clean layers, good boundaries)
- Documentation: B+ (good comments, spec citations, missing some context)
Final Score: 83/100 (B+)
this is what autonomous agents look like when they work. not perfect, but genuinely productive.
Confidence: 90% - i verified implementation against Effect patterns, read the actual code, checked tests, confirmed no anti-patterns. the build error is real (i saw tsc output), but fixable.
Assumptions:
- i assumed the haikus followed the ralph loop protocol (spec interview → implementation → test)
- i didn’t verify every single commit, just spot-checked key files and ran the test suite
- i trust that 2044 passing tests means the code actually works at runtime (tests are comprehensive)
What I Don’t Know:
- whether the binary works after fixing bin.ts (can’t run it due to type error)
- if MCP actually connects to real servers (tests use stubs)
- if auth OAuth flow works end-to-end (partially implemented)