Middleware
langchain.agents.middleware ¶
Entrypoint to using Middleware plugins with Agents.
Reference docs
This page contains reference documentation for Middleware. See the docs for conceptual guides, tutorials, and examples on using Middleware.
| CLASS | DESCRIPTION |
|---|---|
| ContextEditingMiddleware | Automatically prunes tool results to manage context size. |
| HumanInTheLoopMiddleware | Pauses agent execution for human approval of selected tool calls. |
| LLMToolSelectorMiddleware | Uses an LLM to select relevant tools before calling the main model. |
| LLMToolEmulator | Emulates specified tools using an LLM instead of executing them. |
| ModelCallLimitMiddleware | Tracks model call counts and enforces limits. |
| ModelFallbackMiddleware | Automatic fallback to alternative models on errors. |
| PIIMiddleware | Detects and handles Personally Identifiable Information (PII) in agent conversations. |
| PIIDetectionError | Raised when configured to block on detected sensitive values. |
| SummarizationMiddleware | Summarizes conversation history when token limits are approached. |
| ToolCallLimitMiddleware | Tracks tool call counts and enforces limits. |
| AgentMiddleware | Base middleware class for an agent. |
| AgentState | State schema for the agent. |
| ClearToolUsesEdit | Configuration for clearing tool outputs when token limits are exceeded. |
| InterruptOnConfig | Configuration for an action requiring human in the loop. |
| ModelRequest | Model request information for the agent. |
| ModelResponse | Response from model execution, including messages and optional structured output. |
ContextEditingMiddleware ¶
Bases: AgentMiddleware
Automatically prunes tool results to manage context size.
The middleware applies a sequence of edits when the total input token count exceeds configured thresholds. Currently the ClearToolUsesEdit strategy is supported, aligning with Anthropic's clear_tool_uses_20250919 behaviour.
state_schema class-attribute instance-attribute ¶

```python
state_schema: type[StateT] = cast('type[StateT]', AgentState)
```

The schema for state passed to the middleware nodes.

name property ¶

```python
name: str
```

The name of the middleware instance.

Defaults to the class name, but can be overridden for custom naming.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
before_model ¶
Logic to run before the model is called.
abefore_model async ¶
Async logic to run before the model is called.
after_model ¶
Logic to run after the model is called.
aafter_model async ¶
Async logic to run after the model is called.
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Intercept tool execution for retries, monitoring, or modification.
Multiple middleware compose automatically (first defined = outermost). Exceptions propagate unless handle_tool_errors is configured on ToolNode.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Tool call request to execute. TYPE: ToolCallRequest |
| handler | Callable to execute the tool (can be called multiple times). TYPE: Callable[[ToolCallRequest], ToolMessage \| Command] |

| RETURNS | DESCRIPTION |
|---|---|
| ToolMessage \| Command | The tool result, either a ToolMessage or a Command. |
The handler callable can be invoked multiple times for retry logic. Each call to handler is independent and stateless.
Examples:
Modify request before execution:

```python
def wrap_tool_call(self, request, handler):
    request.tool_call["args"]["value"] *= 2
    return handler(request)
```

Retry on error (call handler multiple times):

```python
def wrap_tool_call(self, request, handler):
    for attempt in range(3):
        try:
            result = handler(request)
            if is_valid(result):
                return result
        except Exception:
            if attempt == 2:
                raise
    return result
```

Conditional retry based on response:
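A minimal sketch of this last case, assuming the ToolMessage `status` field distinguishes errors:

```python
def wrap_tool_call(self, request, handler):
    for attempt in range(3):
        result = handler(request)
        # Retry while the tool reports an error status; Command results
        # and successful ToolMessages are returned as-is.
        if isinstance(result, ToolMessage) and result.status == "error" and attempt < 2:
            continue
        return result
```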
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Intercept and control async tool execution via handler callback.
The handler callback executes the tool call and returns a ToolMessage or Command. Middleware can call the handler multiple times for retry logic, skip calling it to short-circuit, or modify the request/response. Multiple middleware compose with first in list as outermost layer.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Tool call request to execute. TYPE: ToolCallRequest |
| handler | Async callable to execute the tool and return its result. TYPE: Callable[[ToolCallRequest], Awaitable[ToolMessage \| Command]] |

| RETURNS | DESCRIPTION |
|---|---|
| ToolMessage \| Command | The tool result, either a ToolMessage or a Command. |
The handler callable can be invoked multiple times for retry logic. Each call to handler is independent and stateless.
Examples:
Async retry on error:
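A minimal async sketch, mirroring the sync retry example above:

```python
async def awrap_tool_call(self, request, handler):
    for attempt in range(3):
        try:
            return await handler(request)
        except Exception:
            if attempt == 2:
                raise  # re-raise after the final attempt
```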
__init__ ¶
```python
__init__(
    *,
    edits: Iterable[ContextEdit] | None = None,
    token_count_method: Literal["approximate", "model"] = "approximate",
) -> None
```

Initializes a context editing middleware instance.
| PARAMETER | DESCRIPTION |
|---|---|
| edits | Sequence of edit strategies to apply. Defaults to a single ClearToolUsesEdit with default settings. TYPE: Iterable[ContextEdit] \| None |
| token_count_method | Whether to use approximate token counting (faster, less accurate) or exact counting implemented by the chat model (potentially slower, more accurate). TYPE: Literal["approximate", "model"] |
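A minimal construction sketch, assuming ClearToolUsesEdit (listed in the class table above) is importable from the same module and usable with its defaults:

```python
from langchain.agents.middleware import ClearToolUsesEdit, ContextEditingMiddleware

middleware = ContextEditingMiddleware(
    edits=[ClearToolUsesEdit()],  # assumed default-configured edit strategy
    token_count_method="approximate",
)
```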
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Apply context edits before invoking the model via handler.
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Apply context edits before invoking the model via handler (async version).
HumanInTheLoopMiddleware ¶
Bases: AgentMiddleware
Middleware that pauses agent execution so a human can review selected tool calls before they run, based on a per-tool interrupt configuration.
state_schema class-attribute instance-attribute ¶

```python
state_schema: type[StateT] = cast('type[StateT]', AgentState)
```

The schema for state passed to the middleware nodes.

name property ¶

```python
name: str
```

The name of the middleware instance.

Defaults to the class name, but can be overridden for custom naming.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
before_model ¶
Logic to run before the model is called.
abefore_model async ¶
Async logic to run before the model is called.
aafter_model async ¶
Async logic to run after the model is called.
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Intercept and control model execution via handler callback.
The handler callback executes the model request and returns a ModelResponse. Middleware can call the handler multiple times for retry logic, skip calling it to short-circuit, or modify the request/response. Multiple middleware compose with first in list as outermost layer.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Model request to execute (includes state and runtime). TYPE: ModelRequest |
| handler | Callback that executes the model request and returns a ModelResponse. TYPE: Callable[[ModelRequest], ModelResponse] |

| RETURNS | DESCRIPTION |
|---|---|
| ModelCallResult | The model call result. |
Examples:
Retry on error:
```python
def wrap_model_call(self, request, handler):
    for attempt in range(3):
        try:
            return handler(request)
        except Exception:
            if attempt == 2:
                raise
```

Rewrite response:

```python
def wrap_model_call(self, request, handler):
    response = handler(request)
    ai_msg = response.result[0]
    return ModelResponse(
        result=[AIMessage(content=f"[{ai_msg.content}]")],
        structured_response=response.structured_response,
    )
```

Error to fallback:

```python
def wrap_model_call(self, request, handler):
    try:
        return handler(request)
    except Exception:
        return ModelResponse(result=[AIMessage(content="Service unavailable")])
```

Cache/short-circuit:

```python
def wrap_model_call(self, request, handler):
    if cached := get_cache(request):
        return cached  # Short-circuit with cached result
    response = handler(request)
    save_cache(request, response)
    return response
```

Simple AIMessage return (converted automatically):
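A minimal sketch of this last case, assuming (per the description above) that a bare AIMessage is converted to a ModelCallResult automatically:

```python
def wrap_model_call(self, request, handler):
    try:
        return handler(request)
    except Exception:
        # Returning an AIMessage directly (instead of a ModelResponse)
        # is converted automatically.
        return AIMessage(content="Model call failed; please try again later.")
```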
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Intercept and control async model execution via handler callback.
The handler callback executes the model request and returns a ModelResponse. Middleware can call the handler multiple times for retry logic, skip calling it to short-circuit, or modify the request/response. Multiple middleware compose with first in list as outermost layer.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Model request to execute (includes state and runtime). TYPE: ModelRequest |
| handler | Async callback that executes the model request and returns a ModelResponse. TYPE: Callable[[ModelRequest], Awaitable[ModelResponse]] |

| RETURNS | DESCRIPTION |
|---|---|
| ModelCallResult | The model call result. |
Examples:
Retry on error:
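A minimal async sketch of the retry pattern, mirroring the sync example above:

```python
async def awrap_model_call(self, request, handler):
    for attempt in range(3):
        try:
            return await handler(request)
        except Exception:
            if attempt == 2:
                raise  # re-raise after the final attempt
```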
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Intercept tool execution for retries, monitoring, or modification. Parameters, return type, and examples are identical to ContextEditingMiddleware.wrap_tool_call above.
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Intercept and control async tool execution via handler callback. Parameters, return type, and examples are identical to ContextEditingMiddleware.awrap_tool_call above.
__init__ ¶
```python
__init__(
    interrupt_on: dict[str, bool | InterruptOnConfig],
    *,
    description_prefix: str = "Tool execution requires approval",
) -> None
```

Initialize the human in the loop middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| interrupt_on | Mapping of tool name to allowed actions. If a tool doesn't have an entry, it's auto-approved by default. TYPE: dict[str, bool \| InterruptOnConfig] |
| description_prefix | The prefix to use when constructing action requests. This is used to provide context about the tool call and the action being requested. Not used if a tool has a description configured in its InterruptOnConfig. TYPE: str |
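A minimal construction sketch; the tool names here are hypothetical:

```python
from langchain.agents.middleware import HumanInTheLoopMiddleware

middleware = HumanInTheLoopMiddleware(
    interrupt_on={
        "send_email": True,   # hypothetical tool that requires approval
        "search_docs": False, # hypothetical tool, explicitly auto-approved
    },
    description_prefix="Tool execution requires approval",
)
```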
LLMToolSelectorMiddleware ¶
Bases: AgentMiddleware
Uses an LLM to select relevant tools before calling the main model.
When an agent has many tools available, this middleware filters them down to only the most relevant ones for the user's query. This reduces token usage and helps the main model focus on the right tools.
Examples:
Limit to 3 tools:
```python
from langchain.agents.middleware import LLMToolSelectorMiddleware

middleware = LLMToolSelectorMiddleware(max_tools=3)

agent = create_agent(
    model="openai:gpt-4o",
    tools=[tool1, tool2, tool3, tool4, tool5],
    middleware=[middleware],
)
```

Use a smaller model for selection:
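A minimal sketch of this case; the smaller model identifier is illustrative:

```python
middleware = LLMToolSelectorMiddleware(
    model="openai:gpt-4o-mini",  # assumed smaller-model identifier
    max_tools=3,
)
```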
state_schema class-attribute instance-attribute ¶

```python
state_schema: type[StateT] = cast('type[StateT]', AgentState)
```

The schema for state passed to the middleware nodes.

name property ¶

```python
name: str
```

The name of the middleware instance.

Defaults to the class name, but can be overridden for custom naming.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
before_model ¶
Logic to run before the model is called.
abefore_model async ¶
Async logic to run before the model is called.
after_model ¶
Logic to run after the model is called.
aafter_model async ¶
Async logic to run after the model is called.
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Intercept tool execution for retries, monitoring, or modification. Parameters, return type, and examples are identical to ContextEditingMiddleware.wrap_tool_call above.
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Intercept and control async tool execution via handler callback. Parameters, return type, and examples are identical to ContextEditingMiddleware.awrap_tool_call above.
__init__ ¶
```python
__init__(
    *,
    model: str | BaseChatModel | None = None,
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
    max_tools: int | None = None,
    always_include: list[str] | None = None,
) -> None
```

Initialize the tool selector.
| PARAMETER | DESCRIPTION |
|---|---|
| model | Model to use for selection. If not provided, uses the agent's main model. Can be a model identifier string or BaseChatModel instance. TYPE: str \| BaseChatModel \| None |
| system_prompt | Instructions for the selection model. TYPE: str |
| max_tools | Maximum number of tools to select. If the model selects more, only the first max_tools will be used. No limit if not specified. TYPE: int \| None |
| always_include | Tool names to always include regardless of selection. These do not count against the max_tools limit. TYPE: list[str] \| None |
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Filter tools based on LLM selection before invoking the model via handler.
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Filter tools based on LLM selection before invoking the model via handler (async version).
LLMToolEmulator ¶
Bases: AgentMiddleware
Emulates specified tools using an LLM instead of executing them.
This middleware allows selective emulation of tools for testing purposes. By default (when tools=None), all tools are emulated. You can specify which tools to emulate by passing a list of tool names or BaseTool instances.
Examples:
Emulate all tools (default behavior):
```python
from langchain.agents.middleware import LLMToolEmulator

middleware = LLMToolEmulator()

agent = create_agent(
    model="openai:gpt-4o",
    tools=[get_weather, get_user_location, calculator],
    middleware=[middleware],
)
```

You can also emulate specific tools by name, use a custom model for emulation, or emulate specific tools by passing tool instances, as sketched below.
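Minimal sketches for those three cases, reusing the tool names from the example above:

```python
# Emulate specific tools by name:
middleware = LLMToolEmulator(tools=["get_weather", "get_user_location"])

# Use a custom model for emulation:
middleware = LLMToolEmulator(model="anthropic:claude-sonnet-4-5-20250929")

# Emulate specific tools by passing tool instances:
middleware = LLMToolEmulator(tools=[get_weather])
```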
state_schema class-attribute instance-attribute ¶

```python
state_schema: type[StateT] = cast('type[StateT]', AgentState)
```

The schema for state passed to the middleware nodes.

name property ¶

```python
name: str
```

The name of the middleware instance.

Defaults to the class name, but can be overridden for custom naming.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
before_model ¶
Logic to run before the model is called.
abefore_model async ¶
Async logic to run before the model is called.
after_model ¶
Logic to run after the model is called.
aafter_model async ¶
Async logic to run after the model is called.
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Intercept and control model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.wrap_model_call above.
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Intercept and control async model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.awrap_model_call above.
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
__init__ ¶
```python
__init__(
    *,
    tools: list[str | BaseTool] | None = None,
    model: str | BaseChatModel | None = None,
) -> None
```

Initialize the tool emulator.
| PARAMETER | DESCRIPTION |
|---|---|
| tools | List of tool names (str) or BaseTool instances to emulate. If None (default), ALL tools will be emulated. If an empty list, no tools will be emulated. TYPE: list[str \| BaseTool] \| None |
| model | Model to use for emulation. Defaults to "anthropic:claude-sonnet-4-5-20250929". Can be a model identifier string or BaseChatModel instance. TYPE: str \| BaseChatModel \| None |
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Emulate tool execution using an LLM if the tool should be emulated.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Tool call request to potentially emulate. TYPE: ToolCallRequest |
| handler | Callback to execute the tool (can be called multiple times). TYPE: Callable[[ToolCallRequest], ToolMessage \| Command] |

| RETURNS | DESCRIPTION |
|---|---|
| ToolMessage \| Command | A ToolMessage with the emulated response if the tool should be emulated; otherwise the result of calling handler for normal execution. |
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Async version of wrap_tool_call.
Emulate tool execution using LLM if tool should be emulated.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Tool call request to potentially emulate. TYPE: ToolCallRequest |
| handler | Async callback to execute the tool (can be called multiple times). TYPE: Callable[[ToolCallRequest], Awaitable[ToolMessage \| Command]] |

| RETURNS | DESCRIPTION |
|---|---|
| ToolMessage \| Command | A ToolMessage with the emulated response if the tool should be emulated; otherwise the result of calling handler for normal execution. |
ModelCallLimitMiddleware ¶
Bases: AgentMiddleware[ModelCallLimitState, Any]
Tracks model call counts and enforces limits.
This middleware monitors the number of model calls made during agent execution and can terminate the agent when specified limits are reached. It supports both thread-level and run-level call counting with configurable exit behaviors.
Thread-level: The middleware tracks the number of model calls and persists call count across multiple runs (invocations) of the agent.
Run-level: The middleware tracks the number of model calls made during a single run (invocation) of the agent.
Example
```python
from langchain.agents.middleware.call_tracking import ModelCallLimitMiddleware
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

# Create middleware with limits
call_tracker = ModelCallLimitMiddleware(thread_limit=10, run_limit=5, exit_behavior="end")

agent = create_agent("openai:gpt-4o", middleware=[call_tracker])

# The agent will automatically jump to the end when limits are exceeded
result = agent.invoke({"messages": [HumanMessage("Help me with a task")]})
```

name property ¶

```python
name: str
```

The name of the middleware instance.

Defaults to the class name, but can be overridden for custom naming.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
abefore_model async ¶
Async logic to run before the model is called.
aafter_model async ¶
Async logic to run after the model is called.
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Intercept and control model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.wrap_model_call above.
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Intercept and control async model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.awrap_model_call above.
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Intercept tool execution for retries, monitoring, or modification. Parameters, return type, and examples are identical to ContextEditingMiddleware.wrap_tool_call above.
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Intercept and control async tool execution via handler callback. Parameters, return type, and examples are identical to ContextEditingMiddleware.awrap_tool_call above.
state_schema class-attribute instance-attribute ¶
The schema for state passed to the middleware nodes.
__init__ ¶
```python
__init__(
    *,
    thread_limit: int | None = None,
    run_limit: int | None = None,
    exit_behavior: Literal["end", "error"] = "end",
) -> None
```

Initialize the call tracking middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| thread_limit | Maximum number of model calls allowed per thread. None means no limit. TYPE: int \| None |
| run_limit | Maximum number of model calls allowed per run. None means no limit. TYPE: int \| None |
| exit_behavior | What to do when limits are exceeded. "end": jump to the end of the agent execution and inject an artificial AI message indicating that the limit was exceeded. "error": raise a ModelCallLimitExceededError. TYPE: Literal["end", "error"] |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If both limits are None. |
before_model ¶
Check model call limits before making a model call.
| PARAMETER | DESCRIPTION |
|---|---|
| state | The current agent state containing call counts. TYPE: ModelCallLimitState |
| runtime | The langgraph runtime. TYPE: Runtime |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, Any] \| None | If limits are exceeded and exit_behavior is "end", a Command to jump to the end with a limit-exceeded message; otherwise None. |

| RAISES | DESCRIPTION |
|---|---|
| ModelCallLimitExceededError | If limits are exceeded and exit_behavior is "error". |
after_model ¶

Increment model call counts after a model call.
ModelFallbackMiddleware ¶
Bases: AgentMiddleware
Automatic fallback to alternative models on errors.
Retries failed model calls with alternative models in sequence until one succeeds or all models are exhausted. The primary model is specified in create_agent().
Example
```python
from langchain.agents.middleware.model_fallback import ModelFallbackMiddleware
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

fallback = ModelFallbackMiddleware(
    "openai:gpt-4o-mini",  # Try first on error
    "anthropic:claude-sonnet-4-5-20250929",  # Then this
)

agent = create_agent(
    model="openai:gpt-4o",  # Primary model
    middleware=[fallback],
)

# If the primary fails: tries gpt-4o-mini, then claude-sonnet-4-5-20250929
result = agent.invoke({"messages": [HumanMessage("Hello")]})
```

state_schema class-attribute instance-attribute ¶

```python
state_schema: type[StateT] = cast('type[StateT]', AgentState)
```

The schema for state passed to the middleware nodes.

name property ¶

```python
name: str
```

The name of the middleware instance.

Defaults to the class name, but can be overridden for custom naming.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
before_model ¶
Logic to run before the model is called.
abefore_model async ¶
Async logic to run before the model is called.
after_model ¶
Logic to run after the model is called.
aafter_model async ¶
Async logic to run after the model is called.
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Intercept tool execution for retries, monitoring, or modification. Parameters, return type, and examples are identical to ContextEditingMiddleware.wrap_tool_call above.
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Intercept and control async tool execution via handler callback. Parameters, return type, and examples are identical to ContextEditingMiddleware.awrap_tool_call above.
__init__ ¶
```python
__init__(
    first_model: str | BaseChatModel, *additional_models: str | BaseChatModel
) -> None
```

Initialize model fallback middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| first_model | First fallback model (string name or instance). TYPE: str \| BaseChatModel |
| *additional_models | Additional fallbacks in order. TYPE: str \| BaseChatModel |
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Try fallback models in sequence on errors.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Initial model request. TYPE: ModelRequest |
| handler | Callback to execute the model. TYPE: Callable[[ModelRequest], ModelResponse] |

| RETURNS | DESCRIPTION |
|---|---|
| ModelCallResult | The AIMessage from the successful model call. |

| RAISES | DESCRIPTION |
|---|---|
| Exception | If all models fail, the last exception is re-raised. |
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Try fallback models in sequence on errors (async version).
| PARAMETER | DESCRIPTION |
|---|---|
| request | Initial model request. TYPE: ModelRequest |
| handler | Async callback to execute the model. TYPE: Callable[[ModelRequest], Awaitable[ModelResponse]] |

| RETURNS | DESCRIPTION |
|---|---|
| ModelCallResult | The AIMessage from the successful model call. |

| RAISES | DESCRIPTION |
|---|---|
| Exception | If all models fail, the last exception is re-raised. |
PIIMiddleware ¶
Bases: AgentMiddleware
Detect and handle Personally Identifiable Information (PII) in agent conversations.
This middleware detects common PII types and applies configurable strategies to handle them. It can detect emails, credit cards, IP addresses, MAC addresses, and URLs in both user input and agent output.
Built-in PII types
- email: Email addresses
- credit_card: Credit card numbers (validated with the Luhn algorithm)
- ip: IP addresses (validated with the stdlib)
- mac_address: MAC addresses
- url: URLs (both http/https and bare URLs)
Strategies
- block: Raise an exception when PII is detected
- redact: Replace PII with [REDACTED_TYPE] placeholders
- mask: Partially mask PII (e.g., ****-****-****-1234 for a credit card)
- hash: Replace PII with a deterministic hash (e.g., <email_hash:a1b2c3d4>)
Strategy Selection Guide:
| Strategy | Preserves Identity? | Best For |
|---|---|---|
block | N/A | Avoid PII completely |
redact | No | General compliance, log sanitization |
mask | No | Human readability, customer service UIs |
hash | Yes (pseudonymous) | Analytics, debugging |
Example
```python
from langchain.agents.middleware import PIIMiddleware
from langchain.agents import create_agent

# Redact all emails in user input
agent = create_agent(
    "openai:gpt-5",
    middleware=[
        PIIMiddleware("email", strategy="redact"),
    ],
)

# Use different strategies for different PII types
agent = create_agent(
    "openai:gpt-4o",
    middleware=[
        PIIMiddleware("credit_card", strategy="mask"),
        PIIMiddleware("url", strategy="redact"),
        PIIMiddleware("ip", strategy="hash"),
    ],
)

# Custom PII type with regex
agent = create_agent(
    "openai:gpt-5",
    middleware=[
        PIIMiddleware("api_key", detector=r"sk-[a-zA-Z0-9]{32}", strategy="block"),
    ],
)
```

state_schema class-attribute instance-attribute ¶

```python
state_schema: type[StateT] = cast('type[StateT]', AgentState)
```

The schema for state passed to the middleware nodes.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
abefore_model async ¶
Async logic to run before the model is called.
aafter_model async ¶
Async logic to run after the model is called.
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Intercept and control model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.wrap_model_call above.
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Intercept and control async model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.awrap_model_call above.
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Intercept tool execution for retries, monitoring, or modification. Parameters, return type, and examples are identical to ContextEditingMiddleware.wrap_tool_call above.
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Intercept and control async tool execution via handler callback. Parameters, return type, and examples are identical to ContextEditingMiddleware.awrap_tool_call above.
__init__ ¶
```python
__init__(
    pii_type: Literal["email", "credit_card", "ip", "mac_address", "url"] | str,
    *,
    strategy: Literal["block", "redact", "mask", "hash"] = "redact",
    detector: Callable[[str], list[PIIMatch]] | str | None = None,
    apply_to_input: bool = True,
    apply_to_output: bool = False,
    apply_to_tool_results: bool = False,
) -> None
```

Initialize the PII detection middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| pii_type | Type of PII to detect. Can be a built-in type (email, credit_card, ip, mac_address, url) or a custom type name (requires a detector). TYPE: str |
| strategy | How to handle detected PII: block, redact, mask, or hash (see Strategies above). TYPE: Literal["block", "redact", "mask", "hash"] |
| detector | Custom detector function or regex pattern. A callable should take the text and return a list of PIIMatch; a string is treated as a regex pattern. Required for custom PII types. TYPE: Callable[[str], list[PIIMatch]] \| str \| None |
| apply_to_input | Whether to check user messages before the model call. TYPE: bool |
| apply_to_output | Whether to check AI messages after the model call. TYPE: bool |
| apply_to_tool_results | Whether to check tool result messages after tool execution. TYPE: bool |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If pii_type is not built-in and no detector is provided. |
before_model ¶
```python
before_model(state: AgentState, runtime: Runtime) -> dict[str, Any] | None
```

Check user messages and tool results for PII before model invocation.
| PARAMETER | DESCRIPTION |
|---|---|
| state | The current agent state. TYPE: AgentState |
| runtime | The langgraph runtime. TYPE: Runtime |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, Any] \| None | Updated state with PII handled according to the strategy, or None if no PII was detected. |

| RAISES | DESCRIPTION |
|---|---|
| PIIDetectionError | If PII is detected and the strategy is "block". |
after_model ¶
```python
after_model(state: AgentState, runtime: Runtime) -> dict[str, Any] | None
```

Check AI messages for PII after model invocation.
| PARAMETER | DESCRIPTION |
|---|---|
| state | The current agent state. TYPE: AgentState |
| runtime | The langgraph runtime. TYPE: Runtime |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, Any] \| None | Updated state with PII handled according to the strategy, or None if no PII was detected. |

| RAISES | DESCRIPTION |
|---|---|
| PIIDetectionError | If PII is detected and the strategy is "block". |
SummarizationMiddleware ¶
Bases: AgentMiddleware
Summarizes conversation history when token limits are approached.
This middleware monitors message token counts and automatically summarizes older messages when a threshold is reached, preserving recent messages and maintaining context continuity by ensuring AI/Tool message pairs remain together.
state_schema class-attribute instance-attribute ¶

```python
state_schema: type[StateT] = cast('type[StateT]', AgentState)
```

The schema for state passed to the middleware nodes.

name property ¶

```python
name: str
```

The name of the middleware instance.

Defaults to the class name, but can be overridden for custom naming.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
abefore_model async ¶
Async logic to run before the model is called.
after_model ¶
Logic to run after the model is called.
aafter_model async ¶
Async logic to run after the model is called.
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Intercept and control model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.wrap_model_call above.
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Intercept and control async model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.awrap_model_call above.
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Intercept tool execution for retries, monitoring, or modification. Parameters, return type, and examples are identical to ContextEditingMiddleware.wrap_tool_call above.
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Intercept and control async tool execution via handler callback. Parameters, return type, and examples are identical to ContextEditingMiddleware.awrap_tool_call above.
__init__ ¶
```python
__init__(
    model: str | BaseChatModel,
    max_tokens_before_summary: int | None = None,
    messages_to_keep: int = _DEFAULT_MESSAGES_TO_KEEP,
    token_counter: TokenCounter = count_tokens_approximately,
    summary_prompt: str = DEFAULT_SUMMARY_PROMPT,
    summary_prefix: str = SUMMARY_PREFIX,
) -> None
```

Initialize the summarization middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| model | The language model to use for generating summaries. TYPE: str \| BaseChatModel |
| max_tokens_before_summary | Token threshold that triggers summarization. If None, summarization is never triggered. TYPE: int \| None |
| messages_to_keep | Number of recent messages to preserve after summarization. TYPE: int |
| token_counter | Function to count tokens in messages. TYPE: TokenCounter |
| summary_prompt | Prompt template for generating summaries. TYPE: str |
| summary_prefix | Prefix added to the system message when including the summary. TYPE: str |
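A minimal construction sketch; the model identifier and thresholds are illustrative:

```python
from langchain.agents.middleware import SummarizationMiddleware

middleware = SummarizationMiddleware(
    model="openai:gpt-4o-mini",        # illustrative summarization model
    max_tokens_before_summary=4000,    # illustrative trigger threshold
    messages_to_keep=20,
)
```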
ToolCallLimitMiddleware ¶
Bases: AgentMiddleware[ToolCallLimitState, Any]
Tracks tool call counts and enforces limits.
This middleware monitors the number of tool calls made during agent execution and can terminate the agent when specified limits are reached. It supports both thread-level and run-level call counting with configurable exit behaviors.
Thread-level: The middleware tracks the total number of tool calls and persists call count across multiple runs (invocations) of the agent.
Run-level: The middleware tracks the number of tool calls made during a single run (invocation) of the agent.
Example
```python
from langchain.agents.middleware.tool_call_limit import ToolCallLimitMiddleware
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

# Limit all tool calls globally
global_limiter = ToolCallLimitMiddleware(thread_limit=20, run_limit=10, exit_behavior="end")

# Limit a specific tool
search_limiter = ToolCallLimitMiddleware(
    tool_name="search", thread_limit=5, run_limit=3, exit_behavior="end"
)

# Use both in the same agent
agent = create_agent("openai:gpt-4o", middleware=[global_limiter, search_limiter])

result = agent.invoke({"messages": [HumanMessage("Help me with a task")]})
```

before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
abefore_model async ¶
Async logic to run before the model is called.
aafter_model async ¶
Async logic to run after the model is called.
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```

Intercept and control model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.wrap_model_call above.
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```

Intercept and control async model execution via handler callback. Parameters, return type, and examples are identical to HumanInTheLoopMiddleware.awrap_model_call above.
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```

Intercept tool execution for retries, monitoring, or modification. Parameters, return type, and examples are identical to ContextEditingMiddleware.wrap_tool_call above.
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```

Intercept and control async tool execution via handler callback. Parameters, return type, and examples are identical to ContextEditingMiddleware.awrap_tool_call above.
state_schema class-attribute instance-attribute ¶
The schema for state passed to the middleware nodes.
__init__ ¶
```python
__init__(
    *,
    tool_name: str | None = None,
    thread_limit: int | None = None,
    run_limit: int | None = None,
    exit_behavior: Literal["end", "error"] = "end",
) -> None
```

Initialize the tool call limit middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| tool_name | Name of the specific tool to limit. If None, the limits apply to all tool calls. TYPE: str \| None |
| thread_limit | Maximum number of tool calls allowed per thread. None means no limit. Defaults to None. TYPE: int \| None |
| run_limit | Maximum number of tool calls allowed per run. None means no limit. Defaults to None. TYPE: int \| None |
| exit_behavior | What to do when limits are exceeded. "end": jump to the end of the agent execution and inject an artificial AI message indicating that the limit was exceeded. "error": raise a ToolCallLimitExceededError. Defaults to "end". TYPE: Literal["end", "error"] |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If both limits are None. |
name property ¶
```python
name: str
```

The name of the middleware instance.
Includes the tool name if specified to allow multiple instances of this middleware with different tool names.
before_model ¶
Check tool call limits before making a model call.
| PARAMETER | DESCRIPTION |
|---|---|
| state | The current agent state containing tool call counts. TYPE: ToolCallLimitState |
| runtime | The langgraph runtime. TYPE: Runtime |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, Any] \| None | If limits are exceeded and exit_behavior is "end", a Command to jump to the end with a limit-exceeded message; otherwise None. |

| RAISES | DESCRIPTION |
|---|---|
| ToolCallLimitExceededError | If limits are exceeded and exit_behavior is "error". |
after_model ¶
Increment tool call counts after a model call (when tool calls are made).
| PARAMETER | DESCRIPTION |
|---|---|
| state | The current agent state. TYPE: ToolCallLimitState |
| runtime | The langgraph runtime. TYPE: Runtime |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, Any] \| None | State updates with incremented tool call counts if tool calls were made. |
AgentMiddleware ¶
Bases: Generic[StateT, ContextT]
Base middleware class for an agent.
Subclass this and implement any of the defined methods to customize agent behavior between steps in the main agent loop.
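A minimal sketch of a custom subclass using the before_model hook documented below; the logging behavior is illustrative:

```python
from typing import Any

from langchain.agents.middleware import AgentMiddleware


class MessageCountMiddleware(AgentMiddleware):
    """Illustrative middleware that logs the message count before each model call."""

    def before_model(self, state, runtime) -> dict[str, Any] | None:
        print(f"Calling model with {len(state['messages'])} messages")
        return None  # no state update
```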
state_schema class-attribute instance-attribute ¶

```python
state_schema: type[StateT] = cast('type[StateT]', AgentState)
```

The schema for state passed to the middleware nodes.

name property ¶

```python
name: str
```

The name of the middleware instance.

Defaults to the class name, but can be overridden for custom naming.
before_agent ¶
Logic to run before the agent execution starts.
abefore_agent async ¶
Async logic to run before the agent execution starts.
before_model ¶
Logic to run before the model is called.
abefore_model async ¶
Async logic to run before the model is called.
after_model ¶
Logic to run after the model is called.
aafter_model async ¶
Async logic to run after the model is called.
wrap_model_call ¶
```python
wrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], ModelResponse]
) -> ModelCallResult
```
Intercept and control model execution via handler callback.
The handler callback executes the model request and returns a ModelResponse. Middleware can call the handler multiple times for retry logic, skip calling it to short-circuit, or modify the request/response. Multiple middleware compose, with the first in the list as the outermost layer.
| PARAMETER | DESCRIPTION |
|---|---|
request | Model request to execute (includes state and runtime). TYPE: ModelRequest |
handler | Callback that executes the model request and returns a ModelResponse. TYPE: Callable[[ModelRequest], ModelResponse] |
| RETURNS | DESCRIPTION |
|---|---|
ModelCallResult | The final result of the model call. |
Examples:
Retry on error:
```python
def wrap_model_call(self, request, handler):
    for attempt in range(3):
        try:
            return handler(request)
        except Exception:
            if attempt == 2:
                raise
```
Rewrite response:
```python
def wrap_model_call(self, request, handler):
    response = handler(request)
    ai_msg = response.result[0]
    return ModelResponse(
        result=[AIMessage(content=f"[{ai_msg.content}]")],
        structured_response=response.structured_response,
    )
```
Error to fallback:
```python
def wrap_model_call(self, request, handler):
    try:
        return handler(request)
    except Exception:
        return ModelResponse(result=[AIMessage(content="Service unavailable")])
```
Cache/short-circuit:
```python
def wrap_model_call(self, request, handler):
    if cached := get_cache(request):
        return cached  # Short-circuit with cached result
    response = handler(request)
    save_cache(request, response)
    return response
```
Simple AIMessage return (converted automatically):
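A minimal sketch of the conversion noted above; returning a bare AIMessage instead of a full ModelResponse is wrapped for you:

```python
def wrap_model_call(self, request, handler):
    response = handler(request)
    # A bare AIMessage (rather than a ModelResponse) is converted automatically
    return AIMessage(content=response.result[0].content.upper())
```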
awrap_model_call async ¶
```python
awrap_model_call(
    request: ModelRequest, handler: Callable[[ModelRequest], Awaitable[ModelResponse]]
) -> ModelCallResult
```
Intercept and control async model execution via handler callback.
The handler callback executes the model request and returns a ModelResponse. Middleware can call the handler multiple times for retry logic, skip calling it to short-circuit, or modify the request/response. Multiple middleware compose, with the first in the list as the outermost layer.
| PARAMETER | DESCRIPTION |
|---|---|
request | Model request to execute (includes state and runtime). TYPE: ModelRequest |
handler | Async callback that executes the model request and returns a ModelResponse. TYPE: Callable[[ModelRequest], Awaitable[ModelResponse]] |
| RETURNS | DESCRIPTION |
|---|---|
ModelCallResult | The final result of the model call. |
Examples:
Retry on error:
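A minimal async sketch, mirroring the synchronous retry example for wrap_model_call:

```python
async def awrap_model_call(self, request, handler):
    for attempt in range(3):
        try:
            return await handler(request)
        except Exception:
            if attempt == 2:
                raise
```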
after_agent ¶
Logic to run after the agent execution completes.
aafter_agent async ¶
Async logic to run after the agent execution completes.
wrap_tool_call ¶
```python
wrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command
```
Intercept tool execution for retries, monitoring, or modification.
Multiple middleware compose automatically (first defined = outermost). Exceptions propagate unless handle_tool_errors is configured on ToolNode.
| PARAMETER | DESCRIPTION |
|---|---|
request | Tool call request with the call, state, and runtime. TYPE: ToolCallRequest |
handler | Callable to execute the tool (can be called multiple times). TYPE: Callable[[ToolCallRequest], ToolMessage | Command] |
| RETURNS | DESCRIPTION |
|---|---|
ToolMessage | Command | The final ToolMessage or Command. |
The handler callable can be invoked multiple times for retry logic. Each call to handler is independent and stateless.
Examples:
Modify request before execution:
```python
def wrap_tool_call(self, request, handler):
    request.tool_call["args"]["value"] *= 2
    return handler(request)
```
Retry on error (call handler multiple times):
```python
def wrap_tool_call(self, request, handler):
    for attempt in range(3):
        try:
            result = handler(request)
            if is_valid(result):
                return result
        except Exception:
            if attempt == 2:
                raise
    return result
```
Conditional retry based on response:
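A minimal sketch of a response-conditioned retry, assuming the tool result is a ToolMessage whose status field reports errors (the check is illustrative):

```python
def wrap_tool_call(self, request, handler):
    result = handler(request)
    # Retry once when the tool reports a failure (status check is illustrative)
    if isinstance(result, ToolMessage) and result.status == "error":
        result = handler(request)
    return result
```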
awrap_tool_call async ¶
```python
awrap_tool_call(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
) -> ToolMessage | Command
```
Intercept and control async tool execution via handler callback.
The handler callback executes the tool call and returns a ToolMessage or Command. Middleware can call the handler multiple times for retry logic, skip calling it to short-circuit, or modify the request/response. Multiple middleware compose, with the first in the list as the outermost layer.
| PARAMETER | DESCRIPTION |
|---|---|
request | Tool call request with the call, state, and runtime. TYPE: ToolCallRequest |
handler | Async callable to execute the tool (can be called multiple times). TYPE: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]] |
| RETURNS | DESCRIPTION |
|---|---|
ToolMessage | Command | The final ToolMessage or Command. |
The handler callable can be invoked multiple times for retry logic. Each call to handler is independent and stateless.
Examples:
Async retry on error:
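A minimal async sketch, mirroring the synchronous retry example above:

```python
async def awrap_tool_call(self, request, handler):
    for attempt in range(3):
        try:
            return await handler(request)
        except Exception:
            if attempt == 2:
                raise
```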
ClearToolUsesEdit dataclass ¶
Bases: ContextEdit
Configuration for clearing tool outputs when token limits are exceeded.
trigger class-attribute instance-attribute ¶
trigger: int = 100000 Token count that triggers the edit.
clear_at_least class-attribute instance-attribute ¶
clear_at_least: int = 0 Minimum number of tokens to reclaim when the edit runs.
keep class-attribute instance-attribute ¶
keep: int = 3 Number of most recent tool results that must be preserved.
clear_tool_inputs class-attribute instance-attribute ¶
clear_tool_inputs: bool = False Whether to clear the originating tool call parameters on the AI message.
exclude_tools class-attribute instance-attribute ¶
List of tool names to exclude from clearing.
placeholder class-attribute instance-attribute ¶
placeholder: str = DEFAULT_TOOL_PLACEHOLDER Placeholder text inserted for cleared tool outputs.
apply ¶
```python
apply(messages: list[AnyMessage], *, count_tokens: TokenCounter) -> None
```
Apply the clear-tool-uses strategy.
InterruptOnConfig ¶
Bases: TypedDict
Configuration for an action requiring human in the loop.
This is the configuration format used in the HumanInTheLoopMiddleware.__init__ method.
allowed_decisions instance-attribute ¶
allowed_decisions: list[DecisionType] The decisions that are allowed for this action.
description instance-attribute ¶
description: NotRequired[str | _DescriptionFactory] The description attached to the request for human input.
Can be either:
- A static string describing the approval request
- A callable that dynamically generates the description based on agent state, runtime, and tool call information
Example
```python
# Static string description
config = InterruptOnConfig(
    allowed_decisions=["approve", "reject"],
    description="Please review this tool execution",
)


# Dynamic callable description
def format_tool_description(
    tool_call: ToolCall, state: AgentState, runtime: Runtime
) -> str:
    import json

    return (
        f"Tool: {tool_call['name']}\n"
        f"Arguments:\n{json.dumps(tool_call['args'], indent=2)}"
    )


config = InterruptOnConfig(
    allowed_decisions=["approve", "edit", "reject"],
    description=format_tool_description,
)
```
args_schema instance-attribute ¶
args_schema: NotRequired[dict[str, Any]] JSON schema for the args associated with the action, if edits are allowed.
ModelRequest dataclass ¶
Model request information for the agent.
override ¶
```python
override(**overrides: Unpack[_ModelRequestOverrides]) -> ModelRequest
```
Create a new request with the given overrides applied.
Returns a new ModelRequest instance with the specified attributes replaced. This follows an immutable pattern, leaving the original request unchanged.
| PARAMETER | DESCRIPTION |
|---|---|
**overrides | Keyword arguments for attributes to override. Supported keys: model (BaseChatModel instance), system_prompt (optional system prompt string), messages (list of messages), tool_choice (tool choice configuration), tools (list of available tools), response_format (response format specification), model_settings (additional model settings). TYPE: Unpack[_ModelRequestOverrides] |
| RETURNS | DESCRIPTION |
|---|---|
ModelRequest | New ModelRequest instance with specified overrides applied. |
Examples:
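A short usage sketch (the override values are illustrative):

```python
new_request = request.override(
    system_prompt="You are a terse assistant.",
    tool_choice="auto",
)
# The original request is left unchanged (immutable pattern)
assert new_request is not request
```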
ModelResponse dataclass ¶
Response from model execution including messages and optional structured output.
The result will usually contain a single AIMessage, but may include an additional ToolMessage if the model used a tool for structured output.
before_model ¶
```python
before_model(
    func: _CallableWithStateAndRuntime[StateT, ContextT] | None = None,
    *,
    state_schema: type[StateT] | None = None,
    tools: list[BaseTool] | None = None,
    can_jump_to: list[JumpTo] | None = None,
    name: str | None = None,
) -> (
    Callable[
        [_CallableWithStateAndRuntime[StateT, ContextT]],
        AgentMiddleware[StateT, ContextT],
    ]
    | AgentMiddleware[StateT, ContextT]
)
```
Decorator used to dynamically create a middleware with the before_model hook.
| PARAMETER | DESCRIPTION |
|---|---|
func | The function to be decorated. Must accept state (the current agent state) and runtime (the langgraph runtime). TYPE: _CallableWithStateAndRuntime[StateT, ContextT] | None |
state_schema | Optional custom state schema type. If not provided, uses the default AgentState schema. TYPE: type[StateT] | None |
tools | Optional list of additional tools to register with this middleware. TYPE: list[BaseTool] | None |
can_jump_to | Optional list of valid jump destinations for conditional edges. Valid values are "tools", "model", and "end". TYPE: list[JumpTo] | None |
name | Optional name for the generated middleware class. If not provided, uses the decorated function's name. TYPE: str | None |
| RETURNS | DESCRIPTION |
|---|---|
Callable[[_CallableWithStateAndRuntime[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] | AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly as a decorator) or a decorator function that can be applied to the function it wraps. |
The decorated function should return one of:

- dict[str, Any] - State updates to merge into the agent state
- Command - A command to control flow (e.g., jump to a different node)
- None - No state updates or flow control
Examples:
Basic usage:
```python
@before_model
def log_before_model(state: AgentState, runtime: Runtime) -> None:
    print(f"About to call model with {len(state['messages'])} messages")
```
With conditional jumping:
@before_model(can_jump_to=["end"]) def conditional_before_model(state: AgentState, runtime: Runtime) -> dict[str, Any] | None: if some_condition(state): return {"jump_to": "end"} return None With custom state schema:
after_model ¶
```python
after_model(
    func: _CallableWithStateAndRuntime[StateT, ContextT] | None = None,
    *,
    state_schema: type[StateT] | None = None,
    tools: list[BaseTool] | None = None,
    can_jump_to: list[JumpTo] | None = None,
    name: str | None = None,
) -> (
    Callable[
        [_CallableWithStateAndRuntime[StateT, ContextT]],
        AgentMiddleware[StateT, ContextT],
    ]
    | AgentMiddleware[StateT, ContextT]
)
```
Decorator used to dynamically create a middleware with the after_model hook.
| PARAMETER | DESCRIPTION |
|---|---|
func | The function to be decorated. Must accept state (the current agent state) and runtime (the langgraph runtime). TYPE: _CallableWithStateAndRuntime[StateT, ContextT] | None |
state_schema | Optional custom state schema type. If not provided, uses the default AgentState schema. TYPE: type[StateT] | None |
tools | Optional list of additional tools to register with this middleware. TYPE: list[BaseTool] | None |
can_jump_to | Optional list of valid jump destinations for conditional edges. Valid values are "tools", "model", and "end". TYPE: list[JumpTo] | None |
name | Optional name for the generated middleware class. If not provided, uses the decorated function's name. TYPE: str | None |
| RETURNS | DESCRIPTION |
|---|---|
Callable[[_CallableWithStateAndRuntime[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] | AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly as a decorator) or a decorator function that can be applied to the function it wraps. |
The decorated function should return one of:

- dict[str, Any] - State updates to merge into the agent state
- Command - A command to control flow (e.g., jump to a different node)
- None - No state updates or flow control
Examples:
Basic usage for logging model responses:
```python
@after_model
def log_latest_message(state: AgentState, runtime: Runtime) -> None:
    print(state["messages"][-1].content)
```
With custom state schema:
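A minimal sketch, again with an illustrative MyState schema extending AgentState:

```python
class MyState(AgentState):
    response_count: int


@after_model(state_schema=MyState)
def count_responses(state: MyState, runtime: Runtime) -> dict[str, Any]:
    # Merge an incremented counter into the agent state after each model call
    return {"response_count": state.get("response_count", 0) + 1}
```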
wrap_model_call ¶
```python
wrap_model_call(
    func: _CallableReturningModelResponse[StateT, ContextT] | None = None,
    *,
    state_schema: type[StateT] | None = None,
    tools: list[BaseTool] | None = None,
    name: str | None = None,
) -> (
    Callable[
        [_CallableReturningModelResponse[StateT, ContextT]],
        AgentMiddleware[StateT, ContextT],
    ]
    | AgentMiddleware[StateT, ContextT]
)
```
Create middleware with wrap_model_call hook from a function.
Converts a function with handler callback into middleware that can intercept model calls, implement retry logic, handle errors, and rewrite responses.
| PARAMETER | DESCRIPTION |
|---|---|
func | Function accepting (request, handler) that calls handler(request) to execute the model and returns the final result. TYPE: _CallableReturningModelResponse[StateT, ContextT] | None |
state_schema | Custom state schema. Defaults to AgentState. TYPE: type[StateT] | None |
tools | Additional tools to register with this middleware. TYPE: list[BaseTool] | None |
name | Middleware class name. Defaults to function name. TYPE: str | None |
| RETURNS | DESCRIPTION |
|---|---|
Callable[[_CallableReturningModelResponse[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] | AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly as a decorator) or a decorator function that can be applied to the function it wraps. |
Examples:
Basic retry logic:
```python
@wrap_model_call
def retry_on_error(request, handler):
    max_retries = 3
    for attempt in range(max_retries):
        try:
            return handler(request)
        except Exception:
            if attempt == max_retries - 1:
                raise
```
Model fallback:
```python
@wrap_model_call
def fallback_model(request, handler):
    # Try primary model
    try:
        return handler(request)
    except Exception:
        pass

    # Try fallback model
    request.model = fallback_model_instance
    return handler(request)
```
Rewrite response content (full ModelResponse):
```python
@wrap_model_call
def uppercase_responses(request, handler):
    response = handler(request)
    ai_msg = response.result[0]
    return ModelResponse(
        result=[AIMessage(content=ai_msg.content.upper())],
        structured_response=response.structured_response,
    )
```
Simple AIMessage return (converted automatically):
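A minimal sketch of the conversion noted above; a bare AIMessage is wrapped into the final result for you:

```python
@wrap_model_call
def exclaim(request, handler):
    response = handler(request)
    # Returning an AIMessage directly, rather than a full ModelResponse
    return AIMessage(content=f"{response.result[0].content}!")
```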
wrap_tool_call ¶
```python
wrap_tool_call(
    func: _CallableReturningToolResponse | None = None,
    *,
    tools: list[BaseTool] | None = None,
    name: str | None = None,
) -> Callable[[_CallableReturningToolResponse], AgentMiddleware] | AgentMiddleware
```
Create middleware with wrap_tool_call hook from a function.
Converts a function with handler callback into middleware that can intercept tool calls, implement retry logic, monitor execution, and modify responses.
| PARAMETER | DESCRIPTION |
|---|---|
func | Function accepting (request, handler) that calls handler(request) to execute the tool and returns the final ToolMessage or Command. TYPE: _CallableReturningToolResponse | None |
tools | Additional tools to register with this middleware. TYPE: list[BaseTool] | None |
name | Middleware class name. Defaults to function name. TYPE: str | None |
| RETURNS | DESCRIPTION |
|---|---|
Callable[[_CallableReturningToolResponse], AgentMiddleware] | AgentMiddleware | Either an AgentMiddleware instance (when used directly as a decorator) or a decorator function that can be applied to the function it wraps. |
Examples:
Retry logic:
```python
@wrap_tool_call
def retry_on_error(request, handler):
    max_retries = 3
    for attempt in range(max_retries):
        try:
            return handler(request)
        except Exception:
            if attempt == max_retries - 1:
                raise
```
Async retry logic:
```python
@wrap_tool_call
async def async_retry(request, handler):
    for attempt in range(3):
        try:
            return await handler(request)
        except Exception:
            if attempt == 2:
                raise
```
Modify request:
```python
@wrap_tool_call
def modify_args(request, handler):
    request.tool_call["args"]["value"] *= 2
    return handler(request)
```
Short-circuit with cached result:
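A minimal sketch, mirroring the model-level cache example above; get_cache and save_cache are illustrative helpers:

```python
@wrap_tool_call
def cached_tool_call(request, handler):
    if cached := get_cache(request):
        return cached  # Short-circuit: skip tool execution entirely
    result = handler(request)
    save_cache(request, result)
    return result
```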