Add prompt caching support for AWS Bedrock #3438
5263d8a to 6612939 Compare
@DenysMoskalenko Thanks for working on this, Denys!
Agreed. Can you please have a look at these issues and address them in case they affect this implementation as well?
Sure:
783607c to 900d542 Compare
@DouweM The limitation is:
@DenysMoskalenko If the number 4 changes or becomes model-specific, we can add it to the model profile. But I do think we should take care of staying under the limit, because it's not so easy for the user to do so themselves if there are
In this case, "implicit magic" is a bit intentional, because as I wrote on #3453, the goal is for this to be useful to people who don't want to become experts on prompt caching and the limitations Anthropic enforces, not the more advanced users and use cases that need fine-grained control. In any case, the
Usually with model settings, we silently ignore them if they're not supported (that's why most of them say "Supported by: ..." in the docstring), so I might prefer to say "Supported by: Anthropic on Bedrock", and then silently ignore it for Nova. I agree that raising errors when the user does something unsupported is usually good, but with model settings we typically do a "best effort" so that as many requests as possible succeed.
This PR is stale, and will be closed in 3 days if no reply is received.
Did not see that #3442 was merged. I will continue work here.
79a3817 to a887764 Compare
@DouweM Could you pls take a look?
a887764 to 2f06c77 Compare
@DenysMoskalenko Can you merge main and fix the conflict please?
2f06c77 to cab0176 Compare
- Emit cache-point tool entries so Bedrock accepts cached tool definitions
- Document and test prompt caching (writes + reads) with cassette-body checks
- Refresh Bedrock cassettes and type annotations to align with the new flow
Add convenience setting to cache last user message and implement _limit_cache_points to automatically strip excess cache points when exceeding Bedrock's 4-point maximum.
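To illustrate what stripping excess cache points could look like, here is a minimal sketch. It is not the PR's `_limit_cache_points`; the function name `limit_cache_points`, the choice to keep the most recent cache points and drop the oldest, and the restriction to message content (ignoring any cache points in the system prompt or tool config) are all assumptions made for illustration. The `{"cachePoint": {"type": "default"}}` block shape follows the Bedrock Converse API.

```python
from typing import Any

# Limit of 4 cache points per request, as discussed in this PR.
MAX_CACHE_POINTS = 4


def limit_cache_points(
    messages: list[dict[str, Any]],
    max_points: int = MAX_CACHE_POINTS,
) -> list[dict[str, Any]]:
    """Return a copy of `messages` with at most `max_points` cache points.

    Assumption: the newest cache points are the most useful, so the walk
    goes from the last message backwards and drops whatever is left over.
    """
    remaining = max_points
    trimmed: list[dict[str, Any]] = []
    for message in reversed(messages):
        kept: list[dict[str, Any]] = []
        for block in reversed(message["content"]):
            if "cachePoint" in block:
                if remaining == 0:
                    continue  # over budget: strip this cache point
                remaining -= 1
            kept.append(block)
        # Restore the original block order; the input message is not mutated.
        trimmed.append({**message, "content": kept[::-1]})
    return trimmed[::-1]
```

With six user messages that each end in a cache point, this sketch leaves the cache points on the last four messages and removes them from the first two.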
- Extend `BedrockModelProfile` with `bedrock_supports_prompt_caching` and `bedrock_supports_tool_caching`.
- Update caching logic to conditionally add cache points based on model support.
- Add tests to verify skipping cache points for unsupported models and Nova's tool caching limitations.
- Refine `_map_user_prompt` to handle prompt caching settings.
- Adjust documentation to clarify Bedrock's minimum token threshold for caching.
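The profile-gated, best-effort behaviour described in the commit above could be sketched roughly as follows. `ProfileSketch` is a hypothetical stand-in, not the real `BedrockModelProfile`; only the two flag names come from the commit message. `build_tools` is likewise a hypothetical helper, and the "Nova-like" profile in the usage note (prompt caching supported, tool caching not) is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ProfileSketch:
    """Hypothetical stand-in carrying only the two flags named in the commit."""

    bedrock_supports_prompt_caching: bool = False
    bedrock_supports_tool_caching: bool = False


def cache_point() -> dict[str, Any]:
    # Cache point block shape used by the Bedrock Converse API.
    return {"cachePoint": {"type": "default"}}


def build_tools(
    tool_specs: list[dict[str, Any]],
    profile: ProfileSketch,
    cache_tools: bool,
) -> list[dict[str, Any]]:
    """Wrap tool specs and append a cache point only if the model supports it.

    When caching is requested but unsupported, the setting is silently
    ignored rather than raising, so the request can still succeed.
    """
    tools: list[dict[str, Any]] = [{"toolSpec": spec} for spec in tool_specs]
    if cache_tools and tools and profile.bedrock_supports_tool_caching:
        tools.append(cache_point())
    return tools
```

For a profile with both flags set, the returned list ends with a cache point entry; for a Nova-like profile with `bedrock_supports_tool_caching=False`, the same call returns only the tool specs.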
cab0176 to 2ffc709 Compare
@DenysMoskalenko Thanks Denys!
Summary
Implements AWS Bedrock prompt caching support (closes #3418) by fixing how cache points are sent, documenting the workflow, and extending test coverage to assert cache writes and reads.
Testing