
feat(anthropic): add prompt caching support to direct Anthropic API#2160

Open
chosw1029 wants to merge 1 commit into strands-agents:main from chosw1029:feat/anthropic-prompt-caching

Conversation

@chosw1029

Description

AnthropicModel currently provides no way to take advantage of Anthropic's prompt caching feature, which BedrockModel already supports. There are three gaps:

  1. stream() drops system_prompt_content. The event loop (event_loop/streaming.py) passes both system_prompt and system_prompt_content to the model. BedrockModel uses the latter, but AnthropicModel.stream() only declares system_prompt: str | None and silently swallows system_prompt_content via **kwargs.

  2. format_request() sends system as a plain string. The resulting request body is {"system": "<string>", ...}, so there is no place to attach cache_control.

  3. Cache token counts are dropped. In format_chunk() the metadata case only extracts input_tokens / output_tokens. Anthropic already returns cache_creation_input_tokens and cache_read_input_tokens in the usage object, but they never reach downstream consumers.

This PR closes the three gaps:

  • stream() and format_request() now accept system_prompt_content: list[SystemContentBlock] | None.
  • When system_prompt_content is supplied, the system field is emitted in Anthropic list-form. A cachePoint block attaches cache_control: {"type": "ephemeral"} to the preceding text block, mirroring the convention already used by _format_request_messages.
  • format_chunk() emits cacheReadInputTokens / cacheWriteInputTokens in the metadata usage dict. Names match BedrockModel's, so existing observability code (spans that read cacheReadInputTokens) works without changes. The existing Usage TypedDict already declares both fields as optional.
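The system translation described above can be sketched as follows. This is an illustrative, self-contained sketch (the helper name `format_system` is hypothetical; the real logic lives inside `AnthropicModel.format_request()`): a `cachePoint` block attaches `cache_control: {"type": "ephemeral"}` to the preceding text block.

```python
def format_system(system_prompt_content):
    """Translate a list[SystemContentBlock] into Anthropic list-form `system`.

    Hypothetical sketch of the translation this PR describes: text blocks
    become {"type": "text", ...} entries, and a cachePoint block marks the
    preceding text block with cache_control.
    """
    blocks = []
    for block in system_prompt_content:
        if "text" in block:
            blocks.append({"type": "text", "text": block["text"]})
        elif "cachePoint" in block and blocks:
            # Cache breakpoint: everything up to and including the
            # preceding block becomes the cacheable prefix.
            blocks[-1]["cache_control"] = {"type": "ephemeral"}
    return blocks


system = format_system([
    {"text": "You are a helpful assistant."},
    {"cachePoint": {"type": "default"}},
])
print(system)
# [{'type': 'text', 'text': 'You are a helpful assistant.',
#   'cache_control': {'type': 'ephemeral'}}]
```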

No beta headers are required — prompt caching is GA on the direct Anthropic API for Claude Sonnet 3.7+, Sonnet 4.x, Opus 4.x, and Haiku 4.x. The minimum cacheable prefix is enforced by the API, not the SDK.

Backwards compatibility

  • stream() / format_request() add a keyword argument with a default of None; existing callers are unaffected.
  • When system_prompt_content is absent, behavior is byte-for-byte identical (system is still sent as a plain string).
  • format_chunk() only adds cache keys to the metadata usage dict when the upstream response reports non-zero cache counts; otherwise the shape is unchanged.
  • No change to AnthropicConfig TypedDict — caching is driven by system_prompt_content, which is already produced by Agent when the caller passes a list[SystemContentBlock] system prompt.
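The metadata translation behind the third bullet can be sketched like this (assumed field shapes; `translate_usage` is an illustrative name, not the actual method): cache counts from the Anthropic `usage` object are surfaced under the Bedrock-compatible names, and only when non-zero, so the pre-patch dict shape is otherwise unchanged.

```python
def translate_usage(anthropic_usage: dict) -> dict:
    """Map an Anthropic usage object onto the Bedrock-style usage dict.

    Sketch of the behaviour this PR describes: cache keys are added only
    when the upstream response reports non-zero counts.
    """
    usage = {
        "inputTokens": anthropic_usage.get("input_tokens", 0),
        "outputTokens": anthropic_usage.get("output_tokens", 0),
    }
    if anthropic_usage.get("cache_read_input_tokens"):
        usage["cacheReadInputTokens"] = anthropic_usage["cache_read_input_tokens"]
    if anthropic_usage.get("cache_creation_input_tokens"):
        usage["cacheWriteInputTokens"] = anthropic_usage["cache_creation_input_tokens"]
    return usage


# Zero cache counts: shape is unchanged.
print(translate_usage({"input_tokens": 10, "output_tokens": 5}))
# {'inputTokens': 10, 'outputTokens': 5}
```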

Related Issues

Relates to #1140 (Prompt caching support for all models — currently "ready for contribution") and #1432 (cache_strategy="auto" across providers; the AnthropicModel path was listed but never implemented).

Documentation PR

Will follow up with a docs PR on strands-agents/agents-docs once the approach is confirmed. Happy to include it in this PR if preferred.

Type of Change

New feature

Testing

Unit tests added in tests/strands/models/test_anthropic.py:

  • system_prompt_content with a cachePoint block → request system is list-form and the preceding text block carries cache_control: {"type": "ephemeral"}.
  • system_prompt_content with text blocks only (no cachePoint) → list-form system, no cache_control.
  • system_prompt_content takes precedence over system_prompt when both are supplied.
  • Anthropic response metadata with cache_read_input_tokens / cache_creation_input_tokens → format_chunk exposes cacheReadInputTokens / cacheWriteInputTokens.
  • Anthropic response with zero cache counts → metadata usage dict unchanged.
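The precedence rule exercised by the third test above can be sketched as follows (`resolve_system` is an illustrative helper, not the actual method name): when both arguments are supplied, system_prompt_content wins; otherwise the plain-string system_prompt passes through untouched, preserving pre-patch behavior.

```python
def resolve_system(system_prompt=None, system_prompt_content=None):
    """Sketch of the precedence behaviour described in this PR."""
    if system_prompt_content is not None:
        # List-form system: translate text blocks; a cachePoint marks the
        # preceding block with cache_control.
        system = []
        for block in system_prompt_content:
            if "text" in block:
                system.append({"type": "text", "text": block["text"]})
            elif "cachePoint" in block and system:
                system[-1]["cache_control"] = {"type": "ephemeral"}
        return system
    # Fall back to the plain string: byte-for-byte identical to pre-patch.
    return system_prompt


assert resolve_system(system_prompt="plain") == "plain"
assert resolve_system(
    system_prompt="ignored",
    system_prompt_content=[{"text": "hi"}],
) == [{"type": "text", "text": "hi"}]
```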

Manually verified with a live Claude Sonnet 4.5 call: a second request with an identical system prefix reports cache_read_input_tokens > 0 where the pre-patch code reported 0.

Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli.

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

- accept system_prompt_content in stream() / format_request() and emit
  Anthropic list-form system with cache_control on the block preceding
  a cachePoint
- surface cacheReadInputTokens and cacheWriteInputTokens in metadata
  usage events, matching BedrockModel field names
- add unit tests covering translation precedence and metadata extraction

Relates to strands-agents#1140, strands-agents#1432
