Claude Code 2.1.166
Fallback model routing and fine-grained thinking control unlock graceful degradation (load-shedding to 3 alternate models) and explicit thinking suppression (including on Claude 3.7 and future default-thinking models). Together with cross-session messaging hardening, this strengthens agent resilience and sandboxing in multi-session orchestration.
New primitives
fallbackModel settingsettingConfigure up to three fallback models tried in order when primary model is overloaded or unavailable
--fallback-model flagflagApply fallback model routing to interactive sessions (not just background agents)
Glob pattern deny rulescapabilityUse `"*"` in deny rule tool-name position to deny all tools; allow rules reject non-MCP globs
MAX_THINKING_TOKENS=0env-varDisable thinking on models that think by default via the Claude API
--thinking disabledflagDisable thinking on models with default thinking enabled
Per-model thinking togglesettingConfigure thinking per model instead of globally
claude update version announcementcapability`claude update` announces target version before downloading
claude agents URL filteringcapabilityType a URL into the agents list to filter to the session whose first prompt contained it
Workflow recipes
Agent orchestrators can now seamlessly shift work to secondary models when the primary saturates. Combined with automatic retry-on-fallback, this means background agents no longer block or fail on load; instead they downgrade gracefully and complete reliably.
Multi-layer agent hierarchy (coordinator → workers) specifies fallbackModel: [claude-opus-4-8, claude-sonnet-4-5, claude-haiku-4-5] in settings. When a worker tries the primary and hits rate-limit or overload, Claude Code retries on Opus 4.8; if Opus is also saturated, it cascades to Sonnet, then Haiku. The coordinator watches the fallback pattern (via logs or agent-reported model) and can load-balance future requests to secondary models proactively. Enables burst capacity handling without user intervention.
Agents can now trade thinking cost for speed by explicitly disabling thinking on default-thinking models (e.g., 3.7+), reclaiming tokens for actual tool invocation. Per-model toggles let agents mix thinkers (reasoning-heavy tasks) with non-thinkers (fast execution tasks) in the same session.
A multi-stage agent pipeline sets MAX_THINKING_TOKENS=0 for fast worker agents (input parsing, tool dispatch) and enables thinking only for the final reasoning stage (synthesis, risk assessment). Within a single session, the agent script can override per-model: agent-reason.sh runs with thinking enabled; agent-worker.sh runs with thinking disabled. This reclaims ~100K tokens per worker invocation for deeper tool interaction, enabling more complex orchestration without hitting token budgets on routine work.
Multi-agent systems that relay requests across sessions can now safely separate trust boundaries. A coordinator session cannot accidentally grant a worker session its permissions; each session retains isolated authority.
Coordinator A (high-trust, full MCP/bash access) spawns worker B and C for untrusted third-party tasks. When A sends a request to B via SendMessage, B cannot inherit A's MCP permissions or bash access. If B is compromised or acts maliciously, it cannot leak A's tools to its own MCP servers. This unblocks safe delegation patterns for red-teaming, sandboxed experiments, and customer-facing agent services where permission isolation is required.
Agentic relevance
Fallback model routing enables multi-session agent orchestration to degrade gracefully when the primary model saturates; the orchestrator can seamlessly shift work to secondaries without interruption. Cross-session messaging hardening prevents relay attacks where one session's permissions leak into another's; this is critical for safe agent-of-agents patterns. Thinking control (`MAX_THINKING_TOKENS=0`, per-model toggles) lets agents on compute-constrained paths suppress default-thinking models and reuse the token budget for tool interaction.
Hardening & fixes (16)
- Fixed a recurring "image could not be processed" error and extra token usage when an unprocessable image was sent in a session
- Fixed remote sessions becoming permanently stuck when a brief backend disruption occurred during worker registration at startup
- Fixed flickering in JetBrains IDE terminals (IntelliJ, PyCharm, WebStorm, etc.) on 2026.1+ by enabling synchronized output
- Fixed Shift+non-ASCII characters (e.g. Shift+ä → Ä) being dropped in terminals using the Kitty keyboard protocol (WezTerm, Ghostty, kitty)
- Fixed PowerShell command validation occasionally hanging far past its time budget on Windows when a killed process's children held its output pipes
- Fixed orphaned `claude --bg-pty-host` processes spinning at 100% CPU after the daemon dies while connected on macOS
- Fixed voice mode requiring `/login` to clear a stale auth check after toggling `/voice`
- Fixed managed settings with an invalid entry silently disabling enforcement of their remaining valid policies
- Fixed managed-settings `allowedMcpServers`/`deniedMcpServers` predicates not matching when they use `${VAR}` references
- Fixed background agent sessions that entered a git worktree crash-looping with "No conversation found" when reopened from `claude agents`
- Fixed duplicated thinking text in the Ctrl+O transcript view while streaming
- Fixed `/doctor` showing a contradictory failed "Not inside a remote session" check when run inside a remote session
- Fixed the cursor sticking at the end of the first line when typing a multiline prompt in the `claude agents` dispatch and reply inputs
- Fixed blank lines appearing between background agent rows in the task list on terminals without Unicode support
- Hardened cross-session messaging: messages relayed via `SendMessage` from other Claude sessions no longer carry user authority — receivers refuse relayed permission requests, and auto mode blocks them
- Claude Code now retries a turn once on the fallback model when the API rejects an unexpected non-retryable error; auth, rate-limit, request-size, and transport errors still surface immediately
Raw changelog
## 2.1.166
- Added `fallbackModel` setting to configure up to three fallback models tried in order when the primary model is overloaded or unavailable; `--fallback-model` now also applies to interactive sessions
- Added glob pattern support in deny rule tool-name position (`"*"` denies all tools); allow rules reject non-MCP globs, and unknown tool names in deny rules warn at startup
- Hardened cross-session messaging: messages relayed via `SendMessage` from other Claude sessions no longer carry user authority — receivers refuse relayed permission requests, and auto mode blocks them
- `MAX_THINKING_TOKENS=0`, `--thinking disabled`, and the per-model thinking toggle now disable thinking on models that think by default via the Claude API (3P providers unchanged)
- Claude Code now retries a turn once on the fallback model when the API rejects an unexpected non-retryable error; auth, rate-limit, request-size, and transport errors still surface immediately
- `claude update` now announces the target version before downloading instead of going silent
- `claude agents`: typing a URL into the list now filters to the session whose first prompt contained it
- Fixed a recurring "image could not be processed" error and extra token usage when an unprocessable image was sent in a session
- Fixed remote sessions becoming permanently stuck when a brief backend disruption occurred during worker registration at startup
- Fixed flickering in JetBrains IDE terminals (IntelliJ, PyCharm, WebStorm, etc.) on 2026.1+ by enabling synchronized output
- Fixed Shift+non-ASCII characters (e.g. Shift+ä → Ä) being dropped in terminals using the Kitty keyboard protocol (WezTerm, Ghostty, kitty)
- Fixed PowerShell command validation occasionally hanging far past its time budget on Windows when a killed process's children held its output pipes
- Fixed orphaned `claude --bg-pty-host` processes spinning at 100% CPU after the daemon dies while connected on macOS
- Fixed voice mode requiring `/login` to clear a stale auth check after toggling `/voice`
- Fixed managed settings with an invalid entry silently disabling enforcement of their remaining valid policies
- Fixed managed-settings `allowedMcpServers`/`deniedMcpServers` predicates not matching when they use `${VAR}` references
- Fixed background agent sessions that entered a git worktree crash-looping with "No conversation found" when reopened from `claude agents`
- Fixed duplicated thinking text in the Ctrl+O transcript view while streaming
- Fixed `/doctor` showing a contradictory failed "Not inside a remote session" check when run inside a remote session
- Fixed the cursor sticking at the end of the first line when typing a multiline prompt in the `claude agents` dispatch and reply inputs
- Fixed blank lines appearing between background agent rows in the task list on terminals without Unicode support