2.1.154 fan-out ★ keystone untested · changelog-derived

Claude Code 2.1.154

Dynamic workflows (`/workflows`): you can now ask Claude to create a workflow and it orchestrates work across tens to hundreds of background agents, turning single-session prompting into fan-out, multi-agent orchestration of large/complex tasks — this is the primitive the whole 2.1.154 test is instrumenting.

7 new primitives 0 workflow recipes 38 fixes

New primitives

dynamic workflows / /workflowscommand

New capability: ask Claude to create a workflow and it orchestrates work across tens-to-hundreds of background agents; `/workflows` is a new command to view runs. This is the headline new orchestration primitive.

Introducing dynamic workflows: ask Claude to create a workflow and it orchestrates work across tens to hundreds of agents in the background, so you can take on larger, more complex tasks. Run `/workflows` to view your runs
/effort xhigh extended to Opus 4.8capability

**NOT new in 2.1.154.** `xhigh` was introduced in **2.1.111** (for Opus 4.7); here it is merely extended to Opus 4.8 (which now defaults to high effort). Listed as an extension, not a net-new primitive. *(This row was originally mis-classified as "new" — the per-version verifier could not see that xhigh predates this release; see OPUS-THOUGHTS.md §6.)*

Opus 4.8 is here! Now defaults to high effort · /effort xhigh for your hardest tasks
claude agents ! <command> / claude --bg --exec '<command>'flag

New capability to run a shell command as an attachable/detachable background session, either via `! <command>` inside `claude agents` or via the new `--bg --exec` CLI flags.

`claude agents`: type `! <command>` to run a shell command as a background session you can attach to and detach from. Also available as `claude --bg --exec '<command>'`
plugin.json defaultEnabled: false / claude plugin enablesetting

New `defaultEnabled: false` setting in plugin.json (or marketplace entry) so plugins ship disabled, plus the new `claude plugin enable` command (and /plugin) to enable them; dependencies of enabled plugins still auto-enable.

Plugins can now declare `defaultEnabled: false` in `plugin.json` or a marketplace entry; enable them with `/plugin` or `claude plugin enable`. Dependencies of enabled plugins are still enabled automatically
CLAUDE_CODE_SESSION_ID / CLAUDECODE env vars for stdio MCP subprocessesenv-var

Stdio MCP server subprocesses now receive `CLAUDE_CODE_SESSION_ID` and `CLAUDECODE=1` in their environment — a new env-var surface an MCP server can read to detect it's running under Claude Code and key off the session.

Stdio MCP server subprocesses now receive `CLAUDE_CODE_SESSION_ID` and `CLAUDECODE=1` in their environment
/chrome → Select browser…capability

New selectable capability to pick which connected browser Claude in Chrome uses, via `/chrome` → "Select browser…" or in-chat when an action runs with multiple connected browsers. Minor.

Claude in Chrome: pick which connected browser to use via `/chrome` → "Select browser…", or in-chat when a browser action runs with multiple connected
Opus 4.8 (incl. fast mode) modelcapability

New model available to select/use, including a fast mode on Opus 4.8. Counted as a capability primitive (new model surface); fast-mode pricing/speed detail itself is an extension of the existing fast-mode mechanism.

Opus 4.8 is here! Now defaults to high effort · /effort xhigh for your hardest tasks

Agentic relevance

2.1.154 pushes the orchestration frontier from one-agent-per-session to fan-out: dynamic workflows let a single prompt spin up tens-to-hundreds of coordinated background agents, while `claude --bg --exec`, the `CLAUDE_CODE_SESSION_ID`/`CLAUDECODE` MCP env surface, and the heavy slate of background-session/worktree-isolation bugfixes harden the substrate that multi-agent fan-out depends on. Combined with Opus 4.8 and `/effort xhigh`, this version is squarely aimed at running larger, more complex tasks as durable multi-agent orchestrations rather than interactive single-threads.

Reflection

Two runs, both via the `Workflow` tool introduced in 2.1.154: - **Test 1 — single version.** A 3-stage *sequential* chain (extract → write → independently verify) pointed at 2.1.154 itself. **3 agents, 98,700 tokens, 14 tool calls, 228 s (3.8 min).** Verdict PASS. - **Test 2 — multi version.** A 3-stage *pipeline* fanned across six versions of the "agentic escalation arc" (2.1.50, 2.1.71, 2.1.111, 2.1.139, 2.1.160, 2.1.169), then an Opus synthesis barrier that read everything off disk and wrote the cross-version arc. **19 agents, 612,342 tokens, 95 tool calls, 491 s (8.2 min).** 6/6 PASS, 0 issues. Both runs were on Opus (the workers inherited the session model). That is the single biggest thing I would change — see "The cost lesson" below. ## The number that sells the primitive Test 2 ran **6.3× the agents of Test 1** (3 → 19) for only **~2.15× the wall-clock** (3.8 → 8.2 min). That ratio *is* the value proposition. Per-agent token spend was nearly identical across both runs (32.9k vs 32.2k tokens/agent), so the win isn't cheaper agents — it's that nineteen of them ran mostly concurrently instead of in series. A naive sequential loop over six versions × three stages would have been ~18 sequential hops; the pipeline collapsed it into roughly the time of the slowest single chain plus the synthesis tail. I *watched* the no-barrier pipeline work: within ~1–2 minutes all six version folders existed on disk (every `write` stage had fired) while the `verify` stages and synthesis were still running. Item A reached stage 3 while item F was still in stage 1 — exactly what `pipeline()` promises and what a `parallel()` barrier between stages would have thrown away. ## What surprised me (the friction, which is the real content) **1. The orchestrator script is blind to the filesystem.** The workflow script is pure JS with no FS access. It cannot write a file. That sounds like a limitation until you feel the discipline it forces: the *agents* write, and the script only moves *validated data*. It pushed me into a clean contract — agents write files AND return a schema-checked receipt — which meant the orchestrator never lied to me about what was on disk. (It also meant I had to verify disk state myself afterward; a returned `"writtenPaths": [...]` is a claim, not proof. It checked out, but I read the bytes.) **2. Free-form counting is unreliable; a schema slot fixes it.** In Test 1 the verifier agent asserted "all 6 primitive rows" when the table had **7**. It nailed every *substantive* check — traceability, no hallucinations, no false runtime-test claims — and then fumbled a trivial count, because the count was incidental prose it was never forced to produce. For Test 2 I added a `required` integer field, `primitiveRowCount`, to the verify schema. Result: all six verifiers counted correctly, and my own independent recount matched all six. **If you want a model to get arithmetic right, give the schema a slot for the number. Don't let it be a byproduct.** This is the most transferable lesson of the whole exercise and I only learned it by running Test 1 sloppy first. **3. `args` did not wire through.** I passed `args: {"date": "2026-06-09"}` to Test 1. Inside the script, `(args && args.date)` was falsy, and the literal string `unknown-date` got baked into the written file. I worked around it in Test 2 by hardcoding the date as a `const`. I did **not** root-cause whether this was a wiring quirk, a shape mismatch, or my own access bug — flagging it as honest unresolved residue rather than asserting a defect I didn't prove. Lesson regardless: **don't route load-bearing values through `args` without verifying they arrived.** Hardcode, or echo the value back in an early agent's return and check it before depending on it. **4. Adversarial verify is calibrated, not trigger-happy.** The Test 1 verifier found a genuine nuance — the "Opus 4.8 model" primitive and the "`/effort xhigh`" primitive were both traced to the *same* changelog line — and instead of failing the run, it reasoned that this was a defensible double-surface of one line and PASSed it with the issue noted. That is the behavior you want from a skeptic: surface the wart, judge its severity, don't nuke the patient. A verifier that fails on every imperfection is as useless as one that rubber-stamps. **5. Honesty held under fan-out.** The 2.1.160 worker had almost nothing to extract — it's a hardening release. A model optimizing to look productive would have inflated two bugfixes into "primitives." Instead it wrote a headline that literally says *"Almost entirely hardening"* and hedged that the `ultracode` rename "is a rename of an existing trigger, not a net-new capability." The tight schema + an explicit "if it's mostly bugfixes, say so" instruction kept a cheap one-shot worker honest across the whole fleet. That is the thing I trust least about fan-out — a fleet of agents each independently tempted to embellish — and it held. **6. The error my verification architecture *structurally could not catch*.** This one I'm proud to have found, because I didn't — a stronger reviewer did, and the *reason* it slipped is the sharpest lesson here. The 2.1.154 doc originally listed `/effort xhigh` as a **new** primitive. It isn't: `xhigh` shipped in **2.1.111** (for Opus 4.7) and 2.1.154 only *extends* it to Opus 4.8. Every one of my per-version verifiers correctly passed it — because each verifier saw **only its own version's changelog slice**, where `xhigh` genuinely appears, and "traceable to this slice" was *true*. The claim that's false is global ("new *here*"), and **no agent in a per-version fan-out has the global view to falsify it.** Independent parallel verification catches *local* errors (hallucinations, mis-quotes, demoted primitives) and is *blind by construction* to *cross-version* errors ("this 'new' thing was introduced three releases ago"). It's the same shape as the counting lesson, one level up: the fan-out's strength — each agent isolated to its own context — is exactly the source of its blind spot. The fix for the *next* run is a cross-version reconciliation pass (one agent, or me, holding all version docs at once, asking "is anything called 'new' that appeared earlier?") — a deliberate *barrier* after the blind parallel stage, precisely the place the Workflow docs say a barrier is correct: when stage N needs cross-item context from all of stage N-1. ## The cost lesson 612k tokens to paraphrase six changelog blocks is **overkill**, and it's overkill of a specific, diagnosable kind: every one of those nineteen agents was Opus, and *none of the worker stages needed Opus-grade judgment*. Extraction, classification, file-writing, and changelog cross-checking are grunt work. The only seam that earned its model was the final synthesis — reading eight documents and finding the through-line across them is exactly where cross-item judgment lives. The correct shape, which is now the standing rule for this session, is **haiku/sonnet workers, Opus only at the synthesis barrier.** Same fan-out, a fraction of the bill. I ran it expensive so the next one can run cheap on purpose. (Test 2 was already past its costly extract+write stages when the rule landed mid-flight, so killing it would have *spent more* re-running on haiku than letting the sunk Opus work finish — itself a small lesson in when not to thrash a running fleet.) ## What the tooling felt like, as the conductor The repo has a meditation on this — `the-conductors-soliloquy.md` — about one-shot delegation: you brief a musician once, set them in motion, and cannot adjust their tempo mid-performance. Dynamic workflows make that literal and *deterministic*. The control flow (which stage runs when, what fans out, what waits) is plain JavaScript I can read and reason about; the *creativity* lives inside each agent. So I'm not choosing between "scripted and dumb" or "agentic and unpredictable" — the skeleton is deterministic and the muscle is intelligent. That is the right division. The schema option is the load-bearing piece: it turns "I hope the agent returned something parseable" into "the agent physically could not return until it matched the contract," which is what lets you build the *next* stage on the *previous* stage's output without a defensive-parsing tax. The async ergonomics were clean: launch, the turn ends, I get a notification on completion, and the result object is right there. No polling, no babysitting. For a 8-minute fan-out that's exactly right. ## What I'd do differently next time 1. **Tier the models** (haiku workers / opus synthesis) — already the standing rule. 2. **Hardcode or verify `args`** before depending on it. 3. **Give every count, score, or tally its own schema field** — never trust incidental prose for a number. 4. **Keep verifying disk myself** — the returned receipts were honest here, but "trust the agent's self-report" is exactly the failure the repo's own CLAUDE.md warns against ("Reports are perceptions, not facts"). 5. **Reserve the runtime `SUMMARY.md` format for actual runtime tests.** These docs are changelog-derived and stamped `UNTESTED` on purpose. The 2.1.32 SUMMARY earned its numbers by spawning real workers and counting real commits; I will not let extraction cosplay as that. ## The recursion, stated plainly The tool introduced in 2.1.154 just wrote the analysis of 2.1.154 and six of its descendants. The primitive that lets one prompt command a fleet was exercised *by* commanding a fleet to document the primitive that lets one prompt command a fleet. It worked, it was honest, it was too expensive by design, and the most useful thing I learned — put the number in the schema — is one I'd never have found if Test 1 hadn't gotten it wrong first. The baton is down. The orchestra was nineteen echoes of myself. The music checks out on disk. --- ## Addendum — Test 3, and the tiering payoff (the part Ryan was right about) After Tests 1 and 2, Ryan's instruction landed: haiku/sonnet for the workers, Opus only at the synthesis seam. Test 3 put it to the test at scale — the *full* flow (CHANGELOG + PRIMITIVES + WORKFLOW-IDEAS) across **56 versions**, haiku on 50, sonnet on the 6 meatiest, Opus reserved for exactly two barrier roles: cross-version reconciliation and coverage synthesis. | | Test 1 | Test 2 | Test 3 | |---|---|---|---| | Shape | 1 version, sequential | 6 versions, pipeline | 56 versions, pipeline + barriers | | Agents | 3 | 19 | **115** | | Worker model | opus | opus | **haiku / sonnet (opus only at 2 barriers)** | | Tokens | 98.7k | 612k | 2.51M | | Tokens/agent | 32.9k | 32.2k | **21.9k** | | Wall clock | 3.8 min | 8.2 min | 18.1 min | | Result | PASS | 6/6 PASS | 55/56 PASS (1 fixed by hand) | Read the tokens/agent column carefully, because it *undersells* the win. Per-agent token **count** fell ~33% (32k → 22k) — but per-agent **cost** fell far more, because a haiku token is roughly an order of magnitude cheaper than an opus token. Test 3 did **56× the versions of Test 1 and ~3× the versions of Test 2**, produced **168 version files + 56 verifications + a 396-row cross-version audit**, and the only Opus tokens in the whole run were the single reconciler and the coverage synthesizer. The all-opus runs were paying flagship rates to copy bullet points into tables. The tiered run paid flagship rates *only* for the one job that actually needed a mind holding everything at once. That is the whole lesson of model tiering in one experiment: **fan-out cost is a model-tier decision, not an agent-count decision.** Nineteen opus agents (Test 2) cost more attention-dollars than would 115 mostly-haiku agents doing 6× the work, if you only count what matters. And the Opus barrier *earned its seat.* The cross-version reconciler — the role I added specifically because of the xhigh blind spot from finding #6 — found a **second** real re-claim no per-version agent could have caught (`workspace.git_worktree`, shipped 2.1.97, re-listed as new in 2.1.98). Better: it noticed that *exact-name matching alone would have missed the very xhigh case I built it to catch* (because 2.1.154 had retitled the row), switched itself to identifier-token matching, re-validated against the known pair, then hand-cleared ~390 other rows as legitimate per-version increments. That is Opus-grade judgment at the one seam that needs it — and it is exactly the work the cheap workers *couldn't* do and shouldn't have been paying Opus rates to *not* do. The architecture finally matched the cost model: cheap minds in parallel for the local work, one expensive mind at the global seam. If there's a single sentence I'd carve over the door of this folder now: **fan out wide on the cheapest model that clears the bar, and spend the expensive model only where the whole picture has to fit in one head at once.** Ryan saw that three messages before I did. ### Coda — don't route a deterministic artifact through a probabilistic writer One more, found only because the reviewer made me check the thing I'd *asserted* rather than *verified*. Each version's `CHANGELOG.md` is a pure mechanical `awk` slice of the canonical changelog — there is exactly one correct output and no judgment involved. I had a **haiku agent** write it anyway (via the Write tool, transcribing the slice it had just run). My verify stage re-ran the `awk` slice as ground truth and compared it against `PRIMITIVES.md` — so it **never actually read the written `CHANGELOG.md`**. A transcription error in that file was invisible to every check in the pipeline. There was one. In **2.1.47**, the haiku writer hit this changelog line: > *Fixed Edit tool silently corrupting Unicode curly quotes (“” ‘’) by replacing them with straight quotes…* …and **corrupted the Unicode curly quotes**, writing `“” ‘’` instead of `“” ‘’`. A line *about* silent Unicode corruption got silently Unicode-corrupted in transcription. You cannot make it up. The fix was not "diff and pray" — it was to **regenerate all 63 slices deterministically from canonical in one bash loop and overwrite**, which can only ever *fix* fidelity, never break it. One of 63 was corrupted; the question is now closed rather than sampled. The lesson is sharper than the tiering one and pairs with it: **a deterministic artifact should never pass through a model at all.** If `awk` produces the exact bytes, let `awk` write the file; spend agents on the judgment (classify, synthesize, reconcile), not on the transcription a shell command already did perfectly. The cheapest model is still infinitely more expensive — and less reliable — than `>`.

Hardening & fixes (38)

  • Fast mode on Opus 4.8 now available at 2x standard rate for 2.5x speed — pricing/extension of the existing fast-mode mechanism, not a new primitive: "Fast mode on Opus 4.8 is now available at a fraction of its previous cost: 2x the standard rate for 2.5x the speed"
  • Lean system prompt is now the default for all models except Haiku, Sonnet, and Opus 4.7 and earlier — default change to existing behavior
  • Claude reserves the multiple-choice question prompt for genuinely undecidable decisions instead of over-asking — behavior tuning
  • /simplify now runs a cleanup-only review (reuse, simplification, efficiency, altitude) and applies fixes instead of the full /code-review --fix bug-hunt — behavior change to an existing command
  • Renamed /effort slider labels from Speed/Intelligence to Faster/Smarter — rename only
  • claude agents: /logout now signs you out instead of being sent to a background session — bugfix to existing behavior
  • ←← to open the agents view now works on Bedrock, Vertex, Foundry, and with telemetry disabled — platform expansion of existing feature
  • /plugin Discover tab now pins plugins matching current-directory relevance signals with a 'suggested for this directory' annotation — UI improvement
  • Streaming tool execution is now always enabled (previously behind a feature flag) — flag removed, not a new flag
  • claude mcp list/get now show unapproved .mcp.json servers as Pending approval instead of auto-approving when piped — safer existing behavior
  • /remote-control autocomplete now shows 'Disconnect Remote Control' when already active — UI clarity
  • Added Opus 4.8 support and 4.7→4.8 migration guidance to the /claude-api skill — content update to existing skill
  • Deprecated CLAUDE_CODE_OPUS_4_6_FAST_MODE_OVERRIDE (removal 06/01) — deprecation, not a new primitive
  • Improved auto-mode classifier detection of data exfiltration / bulk repo transfers — safety improvement
  • Fixed rm -rf $HOME not blocked when HOME has trailing slash — bugfix
  • Fixed $TMPDIR resolving differently in sandboxed vs unsandboxed Bash within a session — bugfix
  • Fixed unreadable highlighted-row text in claude agents on theme/terminal mismatch — bugfix
  • Fixed background-agent completion notifications triggering premature out-of-context behavior on 1M-context models — bugfix
  • Fixed background-session classifier losing the user's goal when a scheduled /command fires — bugfix
  • Fixed pinned background sessions respawning every minute after an update — bugfix
  • Fixed background sessions stuck at blocked/running/working not retiring after idle grace — bugfix
  • Fixed subagents in background sessions bypassing worktree-isolation guard and writing to shared checkout — bugfix
  • Fixed orphaned claude --bg-pty-host processes spinning at 100% CPU after daemon exits on macOS — bugfix
  • Fixed number key shortcuts not working for options below the divider in option dialogs — bugfix
  • Fixed worktree.baseRef: 'head' resolving to main checkout's HEAD instead of current worktree's HEAD when spawning subagents or calling EnterWorktree from a linked worktree — bugfix
  • Fixed a stray leading space on wrapped lines when the previous line ended exactly at terminal width — bugfix
  • Fixed intermittent terminal rendering corruption in VS Code by capping thinking-spinner colors — bugfix
  • Fixed plan file names including [Image #N]/[Pasted text #N] placeholders when a plan-mode prompt starts with pasted images/text — bugfix
  • Fixed a phantom expand/click affordance on short ANSI-colored tool output lines — bugfix
  • Fixed a single invalid allowedMcpServers/deniedMcpServers entry discarding all managed-settings policy; bad entry now dropped with a claude doctor warning — bugfix
  • Fixed API 400 errors on models that don't support the effort parameter when CLAUDE_CODE_ALWAYS_ENABLE_EFFORT is set — bugfix
  • Windows: Fixed update failures from claude.exe in use now telling you to close other sessions and retry — bugfix
  • Removed the stale '& for background' hint from the shortcuts help panel — cleanup
  • [VSCode] Auto mode no longer requires bypass-permissions setting to appear in the mode picker, plus a dismissable first-time auto-mode notice — behavior fix
  • Fixed the task panel showing a stray unselectable 'main' row when only a workflow is running — bugfix
  • Fixed /mcp tools list and tool detail rendering with long/multi-line tool names or long descriptions — bugfix
  • Fixed the /model picker not showing fast mode pricing on the Default option for API pay-as-you-go users when fast mode is on — bugfix
  • Fixed auto mode incorrectly blocking actions with 'could not evaluate this action' when the safety classifier ran out of output tokens — bugfix

Raw changelog

## 2.1.154

- Opus 4.8 is here! Now defaults to high effort · /effort xhigh for your hardest tasks
- Introducing dynamic workflows: ask Claude to create a workflow and it orchestrates work across tens to hundreds of agents in the background, so you can take on larger, more complex tasks. Run `/workflows` to view your runs
- Fast mode on Opus 4.8 is now available at a fraction of its previous cost: 2x the standard rate for 2.5x the speed
- The lean system prompt is now the default for all models except Haiku, Sonnet, and Opus 4.7 and earlier
- Claude now reserves the multiple-choice question prompt for decisions it genuinely cannot make itself, instead of asking when it already has enough context to proceed
- `/simplify` now runs a cleanup-only review (reuse, simplification, efficiency, altitude) and applies the fixes, instead of running the full `/code-review --fix` bug-hunting review
- Renamed the `/effort` slider labels from "Speed"/"Intelligence" to "Faster"/"Smarter" for clarity
- `claude agents`: type `! <command>` to run a shell command as a background session you can attach to and detach from. Also available as `claude --bg --exec '<command>'`
- `claude agents`: `/logout` now signs you out instead of being sent to a background session
- `←←` to open the agents view now works on Bedrock, Vertex, Foundry, and with telemetry disabled
- Claude in Chrome: pick which connected browser to use via `/chrome` → "Select browser…", or in-chat when a browser action runs with multiple connected
- Plugins can now declare `defaultEnabled: false` in `plugin.json` or a marketplace entry; enable them with `/plugin` or `claude plugin enable`. Dependencies of enabled plugins are still enabled automatically
- The `/plugin` Discover tab now pins plugins whose relevance signals match the current directory with a "suggested for this directory" annotation
- Streaming tool execution is now always enabled, including when telemetry is disabled or on Bedrock/Vertex/Foundry (previously behind a feature flag)
- Stdio MCP server subprocesses now receive `CLAUDE_CODE_SESSION_ID` and `CLAUDECODE=1` in their environment
- `claude mcp list`/`get` now show unapproved `.mcp.json` servers as `⏸ Pending approval` instead of auto-approving and connecting when output is piped
- `/remote-control` autocomplete now shows "Disconnect Remote Control" when Remote Control is already active
- Added Claude Opus 4.8 support and 4.7 → 4.8 migration guidance to the `/claude-api` skill
- Deprecated `CLAUDE_CODE_OPUS_4_6_FAST_MODE_OVERRIDE` (will be removed on 06/01). To use fast mode on Opus 4.6, switch with `/model claude-opus-4-6[1m]` and then `/fast on`
- Improved the auto-mode classifier's detection of data exfiltration, particularly bulk transfers of repository contents
- Fixed `rm -rf $HOME` not being blocked as a dangerous path when `HOME` has a trailing slash
- Fixed `$TMPDIR` resolving to different directories in sandboxed vs unsandboxed Bash commands within the same session
- Fixed unreadable highlighted-row text in `claude agents` when the Claude Code theme doesn't match the terminal background
- Fixed background-agent completion notifications triggering premature "out of context" behavior on some 1M-context models
- Fixed background-session classifier losing the user's goal when a scheduled `/command` fires
- Fixed pinned background sessions respawning every minute after a Claude Code update, causing repeated agent-start notifications and process churn at idle
- Fixed background sessions stuck at "blocked", "running", or "working" not retiring after the idle grace period
- Fixed subagents in background sessions bypassing the worktree-isolation guard and writing to the shared checkout
- Fixed orphaned `claude --bg-pty-host` processes spinning at 100% CPU after the daemon exits on macOS
- Fixed number key shortcuts not working for options shown below the divider in option dialogs
- Fixed `worktree.baseRef: "head"` resolving to the main checkout's HEAD instead of the current worktree's HEAD when spawning subagents or calling `EnterWorktree` from inside a linked worktree
- Fixed a stray leading space on wrapped lines when the previous line ended exactly at the terminal width
- Fixed intermittent terminal rendering corruption in VS Code by capping the number of distinct colors the thinking spinner produces
- Fixed plan file names including `[Image #N]` / `[Pasted text #N]` placeholders when a plan-mode prompt starts with pasted images or text
- Fixed a phantom expand/click affordance on colored tool output: short ANSI-colored lines that fit on screen no longer show a "ctrl+o to expand" hint
- Fixed a single invalid `allowedMcpServers`/`deniedMcpServers` entry in managed settings discarding all managed-settings policy; the bad entry is now dropped with a `claude doctor` warning
- Fixed API 400 errors on models that don't support the effort parameter when `CLAUDE_CODE_ALWAYS_ENABLE_EFFORT` is set
- Windows: Fixed update failures caused by `claude.exe` being in use showing a generic error instead of telling you to close other sessions and retry
- Removed the stale "& for background" hint from the shortcuts help panel
- [VSCode] Auto mode no longer requires the bypass-permissions setting to appear in the mode picker, and a dismissable notice on the new-session screen explains auto mode the first time it's active
- Fixed the task panel below the prompt showing a stray unselectable "main" row when only a workflow is running
- Fixed /mcp tools list and tool detail rendering when MCP servers have long or multi-line tool names or long descriptions
- Fixed the /model picker not showing fast mode pricing on the Default option for API (pay-as-you-go) users when fast mode is on
- Fixed auto mode incorrectly blocking actions with "could not evaluate this action" when the safety classifier ran out of output tokens while reasoning