Skip to content

feat(anthropic): delegate cache_control kwarg to anthropic top-level param#35967

Merged
ccurme (ccurme) merged 2 commits intomasterfrom
cc/anthropic_automatic_caching
Mar 17, 2026
Merged

feat(anthropic): delegate cache_control kwarg to anthropic top-level param#35967
ccurme (ccurme) merged 2 commits intomasterfrom
cc/anthropic_automatic_caching

Conversation

@ccurme
Copy link
Copy Markdown
Collaborator

Removes handling introduced in #31523, as Anthropic now supports passing in cache_control=... top-level.

@github-actions github-actions Bot added integration PR made that is related to a provider partner package integration dependencies Pull requests that update a dependency file (e.g. `pyproject.toml` or `uv.lock`) anthropic `langchain-anthropic` package issues & PRs size: M 200-499 LOC internal feature For PRs that implement a new feature; NOT A FEATURE REQUEST labels Mar 16, 2026
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Mar 16, 2026

Merging this PR will not alter performance

✅ 3 untouched benchmarks
⏩ 33 skipped benchmarks1


Comparing cc/anthropic_automatic_caching (3565c52) with master (69a7b9c)

Open in CodSpeed

Footnotes

  1. 33 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@ccurme ccurme (ccurme) merged commit 55711b0 into master Mar 17, 2026
66 checks passed
@ccurme ccurme (ccurme) deleted the cc/anthropic_automatic_caching branch March 17, 2026 14:49
Mason Daugherty (mdrxy) added a commit that referenced this pull request Apr 28, 2026
)

Closes #37042

---

`AnthropicPromptCachingMiddleware` was unconditionally setting top-level
`cache_control` in `model_settings` for any `ChatAnthropic` subclass.
That field is direct-Anthropic-API only — `ChatAnthropicBedrock` (which
subclasses `ChatAnthropic` and passed the existing `isinstance` gate)
errored with `cache_control: Extra inputs are not permitted`.
Investigating that surfaced a related regression: PR #35967 also deleted
the block-level `cache_control` injection in `_get_request_payload`,
which silently disabled caching entirely for non-direct subclasses
(Bedrock had been falling back to in-block breakpoints). This restores
both paths.

## Changes
- Add `_is_direct_anthropic_llm_type` predicate that allowlists
`_llm_type == "anthropic-chat"`. Both the middleware's
`_supports_automatic_caching` and the new branch in
`ChatAnthropic._get_request_payload` route through it, so any subclass
that overrides `_llm_type` (Bedrock today, future direct-API variants
tomorrow) is treated as non-direct by default. Replaces the prior
substring-matching denylist on `"bedrock"`/`"vertex"`.
- Restore `_collect_code_execution_tool_ids`,
`_is_code_execution_related_block`, and a new
`_apply_cache_control_to_last_eligible_block` helper in `chat_models`.
For non-direct subclasses, `_get_request_payload` now pops
`cache_control` from kwargs and walks messages newest-to-oldest,
attaching the breakpoint to the last block that isn't
`code_execution`-related (Anthropic forbids breakpoints on those).
- Emit `UserWarning` when `cache_control` is requested but every
candidate block is `code_execution`-related — previously a silent drop.
- `AnthropicPromptCachingMiddleware._apply_caching` now sets the
top-level `cache_control` only when
`_supports_automatic_caching(request.model)`. System-message and
tool-definition breakpoints continue to apply for all `ChatAnthropic`
subclasses, since those are accepted by every transport.
- Note: `ChatAnthropicVertex` does not subclass `ChatAnthropic` (it
lives in `langchain-google-vertexai` and ships its own
`_get_request_payload`), so the chat-models changes here only affect
Bedrock. The middleware-side gate covers Vertex implicitly via the
`isinstance(request.model, ChatAnthropic)` check that already excludes
it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

anthropic `langchain-anthropic` package issues & PRs dependencies Pull requests that update a dependency file (e.g. `pyproject.toml` or `uv.lock`) feature For PRs that implement a new feature; NOT A FEATURE REQUEST integration PR made that is related to a provider partner package integration internal size: M 200-499 LOC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants