The tool search tool enables Claude to work with hundreds or thousands of tools by dynamically discovering and loading them on demand. Instead of loading all tool definitions into the context window upfront, Claude searches your tool catalog (including tool names, descriptions, argument names, and argument descriptions) and loads only the tools it needs.
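This approach addresses two challenges that grow as tool libraries scale: the context window consumed by loading every tool definition upfront, and the accuracy of tool selection.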
Although this is provided as a server-side tool, you can also implement your own client-side tool search functionality. See Custom tool search implementation for details.
The tool search tool is currently in public beta. Include the appropriate beta header for your provider:
| Provider | Beta header | Supported models |
|---|---|---|
| Claude API, Microsoft Foundry | `advanced-tool-use-2025-11-20` | Claude Opus 4.5, Claude Sonnet 4.5 |
| Google Cloud's Vertex AI | `tool-search-tool-2025-10-19` | Claude Opus 4.5, Claude Sonnet 4.5 |
| Amazon Bedrock | `tool-search-tool-2025-10-19` | Claude Opus 4.5 |
On Amazon Bedrock, server-side tool search is available only via the invoke API, not the converse API.
You can also implement client-side tool search by returning tool_reference blocks from your own search implementation.
There are two tool search variants:
- Regex (`tool_search_tool_regex_20251119`): Claude constructs regex patterns to search for tools
- BM25 (`tool_search_tool_bm25_20251119`): Claude uses natural language queries to search for tools

When you enable the tool search tool:
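1. Include a tool search tool (`tool_search_tool_regex_20251119` or `tool_search_tool_bm25_20251119`) in your `tools` list.
2. Set `defer_loading: true` for tools that shouldn't be loaded immediately.
3. When Claude needs a tool that isn't loaded, it searches the catalog, and the matches are returned as `tool_reference` blocks.

This keeps your context window efficient while maintaining high tool selection accuracy.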
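As a rough illustration of the setup described above, the sketch below assembles a `tools` list from a catalog, placing a search tool first and marking every tool as deferred unless it is pinned. The helper `build_tools`, the `always_loaded` parameter, and the sample catalog are hypothetical names introduced for this example; only the `type`, `name`, and `defer_loading` fields come from this document.

```python
from typing import Any

# Search tool entry as shown in this document (regex variant).
SEARCH_TOOL = {
    "type": "tool_search_tool_regex_20251119",
    "name": "tool_search_tool_regex",
}


def build_tools(catalog: list[dict[str, Any]], always_loaded: set[str]) -> list[dict[str, Any]]:
    """Return a tools list: the search tool, then catalog tools deferred unless pinned."""
    tools: list[dict[str, Any]] = [dict(SEARCH_TOOL)]
    for tool in catalog:
        entry = dict(tool)  # shallow copy so the catalog is not mutated
        if tool["name"] not in always_loaded:
            entry["defer_loading"] = True
        tools.append(entry)
    return tools


# Hypothetical two-tool catalog for illustration.
catalog = [
    {
        "name": "get_weather",
        "description": "Get the weather at a specific location",
        "input_schema": {"type": "object", "properties": {"location": {"type": "string"}}},
    },
    {
        "name": "search_files",
        "description": "Search through files in the workspace",
        "input_schema": {"type": "object", "properties": {"query": {"type": "string"}}},
    },
]

tools = build_tools(catalog, always_loaded={"get_weather"})
print([(t["name"], t.get("defer_loading", False)) for t in tools])
```

The resulting list can be passed as the `tools` parameter of a request; which tools to pin is an application decision.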
Here's a simple example with deferred tools:
```bash
curl https://api.anthropic.com/v1/messages \
  --header "x-api-key: $ANTHROPIC_API_KEY" \
  --header "anthropic-version: 2023-06-01" \
  --header "anthropic-beta: advanced-tool-use-2025-11-20" \
  --header "content-type: application/json" \
  --data '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 2048,
    "messages": [
      {
        "role": "user",
        "content": "What is the weather in San Francisco?"
      }
    ],
    "tools": [
      {
        "type": "tool_search_tool_regex_20251119",
        "name": "tool_search_tool_regex"
      },
      {
        "name": "get_weather",
        "description": "Get the weather at a specific location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {"type": "string"},
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        },
        "defer_loading": true
      },
      {
        "name": "search_files",
        "description": "Search through files in the workspace",
        "input_schema": {
          "type": "object",
          "properties": {
            "query": {"type": "string"},
            "file_types": {
              "type": "array",
              "items": {"type": "string"}
            }
          },
          "required": ["query"]
        },
        "defer_loading": true
      }
    ]
  }'
```

The tool search tool has two variants:

```json
{
  "type": "tool_search_tool_regex_20251119",
  "name": "tool_search_tool_regex"
}
```

```json
{
  "type": "tool_search_tool_bm25_20251119",
  "name": "tool_search_tool_bm25"
}
```

Regex variant query format: Python regex, NOT natural language
When using `tool_search_tool_regex_20251119`, Claude constructs regex patterns using Python's `re.search()` syntax, not natural language queries. Common patterns:

- `"weather"` - matches tool names/descriptions containing "weather"
- `"get_.*_data"` - matches tools like `get_user_data`, `get_weather_data`
- `"database.*query|query.*database"` - OR patterns for flexibility
- `"(?i)slack"` - case-insensitive search

Maximum query length: 200 characters
BM25 variant query format: Natural language
When using tool_search_tool_bm25_20251119, Claude uses natural language queries to search for tools.
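To make the difference between the two query formats concrete, the sketch below runs both kinds of query locally over a small, made-up catalog. It is only an approximation of what the server does: the catalog, the way each tool's name and description are joined into one string, the tokenizer, and the BM25 parameters `k1` and `b` are all assumptions, and the scoring is textbook Okapi BM25 rather than a documented implementation. Only `re.search()` semantics and the 200-character limit come from this document.

```python
import math
import re
from collections import Counter

# Hypothetical catalog: tool name -> description (made up for illustration).
CATALOG = {
    "get_weather_data": "Fetch current weather observations for a city",
    "get_user_data": "Look up a user profile by identifier",
    "send_slack_message": "Post a message to a Slack channel",
}

MAX_QUERY_LENGTH = 200  # limit stated above for regex queries


def regex_search(pattern: str) -> list[str]:
    """Return tool names whose name + description match the pattern via re.search()."""
    if len(pattern) > MAX_QUERY_LENGTH:
        raise ValueError("pattern exceeds 200 characters")
    compiled = re.compile(pattern)  # raises re.error for a malformed pattern
    return [name for name, desc in CATALOG.items() if compiled.search(f"{name} {desc}")]


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(query: str, k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank tools for a natural language query with Okapi BM25 (assumed parameters)."""
    docs = {name: tokenize(f"{name} {desc}") for name, desc in CATALOG.items()}
    avg_len = sum(len(tokens) for tokens in docs.values()) / len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    scores: dict[str, float] = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            tf = counts[term]
            if tf:
                idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
                score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)


print(regex_search("get_.*_data"))
print(regex_search("(?i)slack"))
print(bm25_search("post a message in slack"))
```

The regex queries return every tool that matches, in catalog order, while the BM25 query returns a ranking, which is why the regex variant rewards precise patterns and the BM25 variant tolerates loosely worded queries.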
Mark tools for on-demand loading by adding `defer_loading: true`:

```json
{
  "name": "get_weather",
  "description": "Get current weather for a location",
  "input_schema": {
    "type": "object",
    "properties": {
      "location": { "type": "string" },
      "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }
    },
    "required": ["location"]
  },
  "defer_loading": true
}
```

Key points:
- Tools without `defer_loading` are loaded into context immediately
- Tools with `defer_loading: true` are only loaded when Claude discovers them via search
- The tool search tool itself should not have `defer_loading: true`

Both tool search variants (regex and bm25) search tool names, descriptions, argument names, and argument descriptions.
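A small pre-flight check can catch misconfigured `defer_loading` flags before a request is sent. The sketch below is one possible approach, not part of any SDK: `validate_tools` is a hypothetical helper, and its two rules are assumptions drawn from this document (the examples never defer the search tool, and the "All tools deferred" error requires at least one non-deferred tool).

```python
from typing import Any


def validate_tools(tools: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems: list[str] = []
    for tool in tools:
        # Search tools are recognized here by their type prefix (an assumption).
        is_search_tool = str(tool.get("type", "")).startswith("tool_search_tool_")
        if is_search_tool and tool.get("defer_loading"):
            problems.append(f"search tool '{tool['name']}' must not be deferred")
    if tools and all(tool.get("defer_loading") for tool in tools):
        problems.append("all tools are deferred; at least one must be non-deferred")
    return problems


good = [
    {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},
    {"name": "get_weather", "defer_loading": True},
]
bad = [{"name": "get_weather", "defer_loading": True}]

print(validate_tools(good))
print(validate_tools(bad))
```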
When Claude uses the tool search tool, the response includes new block types:

```json
{
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "I'll search for tools to help with the weather information."
    },
    {
      "type": "server_tool_use",
      "id": "srvtoolu_01ABC123",
      "name": "tool_search_tool_regex",
      "input": { "query": "weather" }
    },
    {
      "type": "tool_search_tool_result",
      "tool_use_id": "srvtoolu_01ABC123",
      "content": {
        "type": "tool_search_tool_search_result",
        "tool_references": [{ "type": "tool_reference", "tool_name": "get_weather" }]
      }
    },
    {
      "type": "text",
      "text": "I found a weather tool. Let me get the weather for San Francisco."
    },
    {
      "type": "tool_use",
      "id": "toolu_01XYZ789",
      "name": "get_weather",
      "input": { "location": "San Francisco", "unit": "fahrenheit" }
    }
  ],
  "stop_reason": "tool_use"
}
```

- `server_tool_use`: Indicates Claude is invoking the tool search tool
- `tool_search_tool_result`: Contains the search results with a nested `tool_search_tool_search_result` object
- `tool_references`: Array of `tool_reference` objects pointing to discovered tools
- `tool_use`: Claude invoking the discovered tool

The `tool_reference` blocks are automatically expanded into full tool definitions before being shown to Claude. You don't need to handle this expansion yourself. It happens automatically in the API as long as you provide all matching tool definitions in the `tools` parameter.
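If you want to log or inspect which tools a search surfaced, the block types described above can be walked directly. The sketch below assumes the response content is available as JSON-decoded dictionaries shaped like the example in this section; `discovered_tools` is a hypothetical helper name, and SDK response objects would need attribute access instead of dictionary lookups.

```python
from typing import Any


def discovered_tools(content: list[dict[str, Any]]) -> list[str]:
    """Collect tool names from every tool_search_tool_result block in a response."""
    names: list[str] = []
    for block in content:
        if block.get("type") != "tool_search_tool_result":
            continue
        result = block.get("content", {})
        if result.get("type") != "tool_search_tool_search_result":
            continue  # skip anything that is not a successful search result
        for ref in result.get("tool_references", []):
            if ref.get("type") == "tool_reference":
                names.append(ref["tool_name"])
    return names


# Content blocks copied from the example response in this section.
content = [
    {"type": "text", "text": "I'll search for tools to help with the weather information."},
    {
        "type": "server_tool_use",
        "id": "srvtoolu_01ABC123",
        "name": "tool_search_tool_regex",
        "input": {"query": "weather"},
    },
    {
        "type": "tool_search_tool_result",
        "tool_use_id": "srvtoolu_01ABC123",
        "content": {
            "type": "tool_search_tool_search_result",
            "tool_references": [{"type": "tool_reference", "tool_name": "get_weather"}],
        },
    },
]

print(discovered_tools(content))
```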
The tool search tool works with MCP servers. Add the "mcp-client-2025-11-20" beta header to your API request, and then use mcp_toolset with default_config to defer loading MCP tools:
```bash
curl https://api.anthropic.com/v1/messages \
  --header "x-api-key: $ANTHROPIC_API_KEY" \
  --header "anthropic-version: 2023-06-01" \
  --header "anthropic-beta: advanced-tool-use-2025-11-20,mcp-client-2025-11-20" \
  --header "content-type: application/json" \
  --data '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 2048,
    "mcp_servers": [
      {
        "type": "url",
        "name": "database-server",
        "url": "https://mcp-db.example.com"
      }
    ],
    "tools": [
      {
        "type": "tool_search_tool_regex_20251119",
        "name": "tool_search_tool_regex"
      },
      {
        "type": "mcp_toolset",
        "mcp_server_name": "database-server",
        "default_config": {
          "defer_loading": true
        },
        "configs": {
          "search_events": {
            "defer_loading": false
          }
        }
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "What events are in my database?"
      }
    ]
  }'
```

MCP configuration options:
- `default_config.defer_loading`: Set default for all tools from the MCP server
- `configs`: Override defaults for specific tools by name

You can implement your own tool search logic (e.g., using embeddings or semantic search) by returning `tool_reference` blocks from a custom tool:

```json
{
  "type": "tool_search_tool_result",
  "tool_use_id": "toolu_custom_search",
  "content": {
    "type": "tool_search_tool_search_result",
    "tool_references": [{ "type": "tool_reference", "tool_name": "discovered_tool_name" }]
  }
}
```

Every tool referenced must have a corresponding tool definition in the top-level `tools` parameter with `defer_loading: true`. This approach lets you use more sophisticated search algorithms while maintaining compatibility with the tool search system.
For a complete example using embeddings, see our tool search with embeddings cookbook.
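For a dependency-free picture of the custom approach, the sketch below swaps the embedding search for plain keyword overlap and wraps the matches in the result shape shown above. The helpers `keyword_search` and `build_search_result`, the `top_k` parameter, and the sample catalog are hypothetical; a real implementation would likely replace the scoring function with embeddings or another semantic method.

```python
import re
from typing import Any


def keyword_search(query: str, tools: list[dict[str, Any]], top_k: int = 3) -> list[str]:
    """Rank tools by how many query words appear in their name or description."""
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    scored: list[tuple[int, str]] = []
    for tool in tools:
        text = f"{tool['name']} {tool.get('description', '')}".lower()
        overlap = len(query_terms & set(re.findall(r"[a-z0-9]+", text)))
        if overlap:
            scored.append((overlap, tool["name"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best overlap first, then by name
    return [name for _, name in scored[:top_k]]


def build_search_result(tool_use_id: str, tool_names: list[str]) -> dict[str, Any]:
    """Wrap tool names in the tool_search_tool_result shape shown in this section."""
    return {
        "type": "tool_search_tool_result",
        "tool_use_id": tool_use_id,
        "content": {
            "type": "tool_search_tool_search_result",
            "tool_references": [
                {"type": "tool_reference", "tool_name": name} for name in tool_names
            ],
        },
    }


# Hypothetical deferred tools; each must also appear in the request's tools parameter.
deferred_tools = [
    {"name": "get_weather", "description": "Get the weather at a specific location"},
    {"name": "search_files", "description": "Search through files in the workspace"},
]

matches = keyword_search("weather forecast", deferred_tools)
result = build_search_result("toolu_custom_search", matches)
print(result)
```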
The tool search tool is not compatible with tool use examples. If you need to provide examples of tool usage, use standard tool calling without tool search.
These errors prevent the request from being processed:
All tools deferred:

```json
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "All tools have defer_loading set. At least one tool must be non-deferred."
  }
}
```

Missing tool definition:

```json
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Tool reference 'unknown_tool' has no corresponding tool definition"
  }
}
```

Errors during tool execution return a 200 response with error information in the body:

```json
{
  "type": "tool_result",
  "tool_use_id": "srvtoolu_01ABC123",
  "content": {
    "type": "tool_search_tool_result_error",
    "error_code": "invalid_pattern"
  }
}
```

Error codes:

- `too_many_requests`: Rate limit exceeded for tool search operations
- `invalid_pattern`: Malformed regex pattern
- `pattern_too_long`: Pattern exceeds 200 character limit
- `unavailable`: Tool search service temporarily unavailable

Tool search works with prompt caching. Add `cache_control` breakpoints to optimize multi-turn conversations:
```python
import anthropic

client = anthropic.Anthropic()

# Tool list shared by both requests
tools = [
    {
        "type": "tool_search_tool_regex_20251119",
        "name": "tool_search_tool_regex",
    },
    {
        "name": "get_weather",
        "description": "Get weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
        "defer_loading": True,
    },
]

# First request with tool search
messages = [{"role": "user", "content": "What's the weather in Seattle?"}]

response1 = client.beta.messages.create(
    model="claude-sonnet-4-5-20250929",
    betas=["advanced-tool-use-2025-11-20"],
    max_tokens=2048,
    messages=messages,
    tools=tools,
)

# Add Claude's response to the conversation.
# Note: if response1 stops with a tool_use block (for example get_weather),
# return the matching tool_result before asking a new question.
messages.append({"role": "assistant", "content": response1.content})

# Second request with a cache breakpoint. cache_control is attached to a
# content block rather than to the message itself.
messages.append({
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "What about New York?",
            "cache_control": {"type": "ephemeral"},
        }
    ],
})

response2 = client.beta.messages.create(
    model="claude-sonnet-4-5-20250929",
    betas=["advanced-tool-use-2025-11-20"],
    max_tokens=2048,
    messages=messages,
    tools=tools,
)

# usage is an object rather than a dict, so read the field as an attribute
cache_read = getattr(response2.usage, "cache_read_input_tokens", 0) or 0
print(f"Cache read tokens: {cache_read}")
```

The system automatically expands `tool_reference` blocks throughout the entire conversation history, so Claude can reuse discovered tools in subsequent turns without re-searching.
With streaming enabled, you'll receive tool search events as part of the stream:
```
event: content_block_start
data: {"type": "content_block_start", "index": 1, "content_block": {"type": "server_tool_use", "id": "srvtoolu_xyz789", "name": "tool_search_tool_regex"}}

// Search query streamed
event: content_block_delta
data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"query\":\"weather\"}"}}

// Pause while search executes

// Search results streamed
event: content_block_start
data: {"type": "content_block_start", "index": 2, "content_block": {"type": "tool_search_tool_result", "tool_use_id": "srvtoolu_xyz789", "content": {"type": "tool_search_tool_search_result", "tool_references": [{"type": "tool_reference", "tool_name": "get_weather"}]}}}

// Claude continues with discovered tools
```

You can include the tool search tool in the Messages Batches API. Tool search operations through the Messages Batches API are priced the same as those in regular Messages API requests.
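Returning to the streaming events shown earlier, a client may want to reassemble the search query from its `input_json_delta` fragments and pick up the discovered tools as they arrive. The sketch below assumes each event's `data` payload has already been JSON-decoded into a dictionary and that the query may be split across several deltas; `collect_tool_search` is a hypothetical helper, not an SDK function.

```python
import json
from typing import Any


def collect_tool_search(events: list[dict[str, Any]]) -> tuple[list[str], list[str]]:
    """Return (search queries, discovered tool names) from decoded stream events."""
    partial: dict[int, str] = {}  # block index -> accumulated input JSON text
    discovered: list[str] = []
    for event in events:
        if event.get("type") == "content_block_start":
            block = event["content_block"]
            is_search = str(block.get("name", "")).startswith("tool_search_tool")
            if block.get("type") == "server_tool_use" and is_search:
                partial[event["index"]] = ""
            elif block.get("type") == "tool_search_tool_result":
                refs = block.get("content", {}).get("tool_references", [])
                discovered.extend(ref["tool_name"] for ref in refs)
        elif event.get("type") == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "input_json_delta" and event["index"] in partial:
                partial[event["index"]] += delta["partial_json"]
    # Parse each accumulated input once the stream has been consumed.
    queries = [json.loads(partial[i]).get("query", "") for i in sorted(partial) if partial[i]]
    return queries, discovered


events = [
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "server_tool_use", "id": "srvtoolu_xyz789",
                       "name": "tool_search_tool_regex"}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": '{"query":'}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": '"weather"}'}},
    {"type": "content_block_start", "index": 2,
     "content_block": {"type": "tool_search_tool_result", "tool_use_id": "srvtoolu_xyz789",
                       "content": {"type": "tool_search_tool_search_result",
                                   "tool_references": [{"type": "tool_reference",
                                                        "tool_name": "get_weather"}]}}},
]

queries, discovered = collect_tool_search(events)
print(queries, discovered)
```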
Good use cases:
When traditional tool calling might be better:
Tool search tool usage is tracked in the response usage object:

```json
{
  "usage": {
    "input_tokens": 1024,
    "output_tokens": 256,
    "server_tool_use": {
      "tool_search_requests": 2
    }
  }
}
```
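To monitor search volume across a conversation or batch, the counter above can be summed over responses. This is a minimal sketch that assumes each usage object is available as a JSON-decoded dictionary in the shape shown; `total_tool_search_requests` is a hypothetical helper, and responses without a `server_tool_use` entry are counted as zero.

```python
from typing import Any


def total_tool_search_requests(usages: list[dict[str, Any]]) -> int:
    """Sum tool_search_requests across usage dictionaries, treating missing fields as 0."""
    total = 0
    for usage in usages:
        server_tool_use = usage.get("server_tool_use") or {}
        total += server_tool_use.get("tool_search_requests", 0)
    return total


usages = [
    {"input_tokens": 1024, "output_tokens": 256, "server_tool_use": {"tool_search_requests": 2}},
    {"input_tokens": 512, "output_tokens": 128},  # no search performed on this turn
]

print(total_tool_search_requests(usages))
```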