Reduce costs and latency by caching static or repetitive prompt content (like system prompts, tool definitions, and conversation history) on Anthropic’s servers. This middleware implements a conversational caching strategy that places cache breakpoints after the most recent message, allowing the entire conversation history (including the latest user message) to be cached and reused in subsequent API calls. (A sketch of the resulting cache breakpoint appears after the list below.)

Prompt caching is useful for the following:
Applications with long, static system prompts that don’t change between requests
Agents with many tool definitions that remain constant across invocations
Conversations where early message history is reused across multiple turns
High-volume deployments where reducing API costs and latency is critical
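Concretely, the middleware annotates the most recent content block with an Anthropic `cache_control` breakpoint. A minimal sketch of the resulting message shape (illustrative only; the middleware constructs this for you, and the `ttl` value mirrors the TTL you configure):

```python
# What a cache breakpoint on the latest user message looks like in the
# Anthropic Messages API format (illustrative values).
last_message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "Hi, my name is Bob",
            "cache_control": {"type": "ephemeral", "ttl": "5m"},
        }
    ],
}
```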
The middleware also lets you control its behavior when used with non-Anthropic models. Options: `'ignore'`, `'warn'`, or `'raise'`.
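A minimal configuration sketch. The keyword `unsupported_model_behavior` is an assumption about how this option is exposed; check your installed version:

```python
from langchain_anthropic.middleware import AnthropicPromptCachingMiddleware

caching = AnthropicPromptCachingMiddleware(
    ttl="1h",  # or "5m" (the default TTL window)
    unsupported_model_behavior="warn",  # assumed keyword: 'ignore', 'warn', or 'raise'
)
```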
Full example
The middleware caches content up to and including the latest message in each request. On subsequent requests within the TTL window (5 minutes or 1 hour), previously seen content is retrieved from cache rather than reprocessed, significantly reducing costs and latency. (A sketch showing how to confirm cache hits appears after the list below.)

How it works:
First request: System prompt, tools, and the user message “Hi, my name is Bob” are sent to the API and cached
Second request: The cached content (system prompt, tools, and first message) is retrieved from cache; only the new content (the model’s response from the first request and the new message “What’s my name?”) needs to be processed
This pattern continues for each turn, with each request reusing the cached conversation history
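One way to confirm caching is working is to inspect token accounting on the returned message. A minimal sketch, reusing `agent` and `config` from the full example below; the `usage_metadata` keys (`input_token_details`, `cache_read`, `cache_creation`) follow LangChain’s standard usage metadata and may vary by version:

```python
# Continuing from an invocation like the full example below:
result = agent.invoke({"messages": [HumanMessage("What's my name?")]}, config=config)

# Cached tokens are reported separately from fresh input tokens.
usage = result["messages"][-1].usage_metadata
print(usage.get("input_token_details"))  # e.g. {'cache_read': ..., 'cache_creation': ...}
```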
Prompt caching reduces API costs by caching tokens, but does not provide conversation memory. To persist conversation history across invocations, use a checkpointer like `MemorySaver`.
```python
from langchain_anthropic import ChatAnthropic
from langchain_anthropic.middleware import AnthropicPromptCachingMiddleware
from langchain.agents import create_agent
from langchain.messages import HumanMessage
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver

LONG_PROMPT = """Please be a helpful assistant.

<Lots more context ...>"""

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    system_prompt=LONG_PROMPT,
    middleware=[AnthropicPromptCachingMiddleware(ttl="5m")],
    checkpointer=MemorySaver(),  # Persists conversation history
)

# Use a thread_id to maintain conversation state
config: RunnableConfig = {"configurable": {"thread_id": "user-123"}}

# First invocation: Creates cache with system prompt, tools, and "Hi, my name is Bob"
agent.invoke({"messages": [HumanMessage("Hi, my name is Bob")]}, config=config)

# Second invocation: Reuses cached system prompt, tools, and previous messages
# The checkpointer maintains conversation history, so the agent remembers "Bob"
result = agent.invoke({"messages": [HumanMessage("What's my name?")]}, config=config)
print(result["messages"][-1].content)
```
Your name is Bob! You told me that when you introduced yourself at the start of our conversation.
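Full example: Bash tool

This example provides Claude’s native bash tool via `ClaudeBashToolMiddleware`, with commands executed inside a Docker container through `DockerExecutionPolicy`.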
```python
import tempfile

from langchain_anthropic import ChatAnthropic
from langchain_anthropic.middleware import ClaudeBashToolMiddleware
from langchain.agents import create_agent
from langchain.agents.middleware import DockerExecutionPolicy

# Create a temporary workspace directory for this demo.
# In production, use a persistent directory path.
workspace = tempfile.mkdtemp(prefix="agent-workspace-")

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[],
    middleware=[
        ClaudeBashToolMiddleware(
            workspace_root=workspace,
            startup_commands=["echo 'Session initialized'"],
            execution_policy=DockerExecutionPolicy(
                image="python:3.11-slim",
            ),
        ),
    ],
)

# Claude can now use its native bash tool
result = agent.invoke(
    {"messages": [{"role": "user", "content": "What version of Python is installed?"}]}
)
print(result["messages"][-1].content)
```
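Full example: State-based text editor

This example provides Claude’s text editor tool with files stored in LangGraph state, so edits persist across invocations via the checkpointer.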
```python
from langchain_anthropic import ChatAnthropic
from langchain_anthropic.middleware import StateClaudeTextEditorMiddleware
from langchain.agents import create_agent
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[],
    middleware=[
        StateClaudeTextEditorMiddleware(
            allowed_path_prefixes=["/project"],
        ),
    ],
    checkpointer=MemorySaver(),
)

# Use a thread_id to persist state across invocations
config: RunnableConfig = {"configurable": {"thread_id": "my-session"}}

# Claude can now create and edit files (stored in LangGraph state)
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Create a file at /project/hello.py with a simple hello world program"}]},
    config=config,
)
print(result["messages"][-1].content)
```
I've created a simple "Hello, World!" program at `/project/hello.py`. The program uses Python's `print()` function to display "Hello, World!" to the console when executed.
Full example: Filesystem-based text editor
```python
import tempfile

from langchain_anthropic import ChatAnthropic
from langchain_anthropic.middleware import FilesystemClaudeTextEditorMiddleware
from langchain.agents import create_agent

# Create a temporary workspace directory for this demo.
# In production, use a persistent directory path.
workspace = tempfile.mkdtemp(prefix="editor-workspace-")

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[],
    middleware=[
        FilesystemClaudeTextEditorMiddleware(
            root_path=workspace,
            allowed_prefixes=["/src"],
            max_file_size_mb=10,
        ),
    ],
)

# Claude can now create and edit files (stored on disk)
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Create a file at /src/hello.py with a simple hello world program"}]}
)
print(result["messages"][-1].content)
```
I've created a simple "Hello, World!" program at `/src/hello.py`. The program uses Python's `print()` function to display "Hello, World!" to the console when executed.
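Because this variant writes to disk, you can verify the result outside the agent. A minimal sketch, continuing from the example above and assuming the virtual path `/src/hello.py` maps to `<root_path>/src/hello.py` on disk (check the middleware’s path-mapping rules for your version):

```python
from pathlib import Path

# Assumed mapping: virtual /src/hello.py -> <workspace>/src/hello.py
print(Path(workspace, "src", "hello.py").read_text())
```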
Provide Claude’s memory tool (`memory_20250818`) for persistent agent memory across conversation turns.

The memory middleware is useful for the following:
Long-running agent conversations
Maintaining context across interruptions
Task progress tracking
Persistent agent state management
Claude’s memory tool uses a `/memories` directory and automatically injects a system prompt encouraging the agent to check and update memory.
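Full example: State-based memory

This example stores Claude’s memory files in LangGraph state via `StateClaudeMemoryMiddleware`, persisted across invocations by the checkpointer.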
```python
from langchain_anthropic import ChatAnthropic
from langchain_anthropic.middleware import StateClaudeMemoryMiddleware
from langchain.agents import create_agent
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[],
    middleware=[StateClaudeMemoryMiddleware()],
    checkpointer=MemorySaver(),
)

# Use a thread_id to persist state across invocations
config: RunnableConfig = {"configurable": {"thread_id": "my-session"}}

# Claude can now use memory to track progress (stored in LangGraph state)
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Remember that my favorite color is blue, then confirm what you stored."}]},
    config=config,
)
print(result["messages"][-1].content)
```
Perfect! I've stored your favorite color as **blue** in my memory system. The information is saved in my user preferences file where I can access it in future conversations.
Full example: Filesystem-based memory
The agent will automatically:
Check the `/memories` directory at start
Record progress and thoughts during execution
Update memory files as work progresses
```python
import tempfile

from langchain_anthropic import ChatAnthropic
from langchain_anthropic.middleware import FilesystemClaudeMemoryMiddleware
from langchain.agents import create_agent

# Create a temporary workspace directory for this demo.
# In production, use a persistent directory path.
workspace = tempfile.mkdtemp(prefix="memory-workspace-")

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[],
    middleware=[
        FilesystemClaudeMemoryMiddleware(
            root_path=workspace,
        ),
    ],
)

# Claude can now use memory to track progress (stored on disk)
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Remember that my favorite color is blue, then confirm what you stored."}]}
)
print(result["messages"][-1].content)
```
Perfect! I've stored your favorite color as **blue** in my memory system. The information is saved in my user preferences file where I can access it in future conversations.
The `state_key` parameter selects the state key containing the files to search: use `"text_editor_files"` for text editor files or `"memory_files"` for memory files.
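For instance, the two configurations used in the full examples below (a minimal sketch):

```python
from langchain_anthropic.middleware import StateFileSearchMiddleware

# Search files created by the state-based text editor middleware...
search_editor_files = StateFileSearchMiddleware(state_key="text_editor_files")

# ...or files recorded by the state-based memory middleware.
search_memory_files = StateFileSearchMiddleware(state_key="memory_files")
```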
Full example: Search text editor files
The middleware adds Glob and Grep search tools that work with state-based files.
```python
from langchain_anthropic import ChatAnthropic
from langchain_anthropic.middleware import (
    StateClaudeTextEditorMiddleware,
    StateFileSearchMiddleware,
)
from langchain.agents import create_agent
from langchain.messages import HumanMessage
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[],
    middleware=[
        StateClaudeTextEditorMiddleware(),
        StateFileSearchMiddleware(state_key="text_editor_files"),
    ],
    checkpointer=MemorySaver(),
)

# Use a thread_id to persist state across invocations
config: RunnableConfig = {"configurable": {"thread_id": "my-session"}}

# First invocation: Create some files using the text editor tool
result = agent.invoke(
    {"messages": [HumanMessage("Create a Python project with main.py, utils/helpers.py, and tests/test_main.py")]},
    config=config,
)

# The agent creates files, which are stored in state
print("Files created:", list(result["text_editor_files"].keys()))

# Second invocation: Search the files we just created
# State is automatically persisted via the checkpointer
result = agent.invoke(
    {"messages": [HumanMessage("Find all Python files in the project")]},
    config=config,
)
print(result["messages"][-1].content)
```
I found 5 Python files in the project:

1. `/project/main.py` - Main application file
2. `/project/utils/__init__.py` - Utils package initialization
3. `/project/utils/helpers.py` - Helper utilities
4. `/project/tests/__init__.py` - Tests package initialization
5. `/project/tests/test_main.py` - Main test file

Would you like me to view the contents of any of these files?
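Because the files live in agent state, you can also read them back directly from the returned state. A minimal sketch, continuing from the example above and assuming entries in `text_editor_files` map virtual paths to file contents (the exact value shape may differ by version):

```python
# Hypothetical direct read from state; adjust if entries are richer objects.
files = result["text_editor_files"]
print(files["/project/main.py"])
```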
Full example: Search memory files
```python
from langchain_anthropic import ChatAnthropic
from langchain_anthropic.middleware import (
    StateClaudeMemoryMiddleware,
    StateFileSearchMiddleware,
)
from langchain.agents import create_agent
from langchain.messages import HumanMessage
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[],
    middleware=[
        StateClaudeMemoryMiddleware(),
        StateFileSearchMiddleware(state_key="memory_files"),
    ],
    checkpointer=MemorySaver(),
)

# Use a thread_id to persist state across invocations
config: RunnableConfig = {"configurable": {"thread_id": "my-session"}}

# First invocation: Record some memories
result = agent.invoke(
    {"messages": [HumanMessage("Remember that the project deadline is March 15th and code review deadline is March 10th")]},
    config=config,
)

# The agent creates memory files, which are stored in state
print("Memory files created:", list(result["memory_files"].keys()))

# Second invocation: Search the memories we just recorded
# State is automatically persisted via the checkpointer
result = agent.invoke(
    {"messages": [HumanMessage("Search my memories for project deadlines")]},
    config=config,
)
print(result["messages"][-1].content)
```
I found your project deadlines in my memory! Here's what I have recorded:

## Important Deadlines
- **Code Review Deadline:** March 10th
- **Project Deadline:** March 15th

## Notes
- Code review must be completed 5 days before final project deadline
- Need to ensure all code is ready for review by March 10th

Is there anything specific about these deadlines you'd like to know or update?