Streaming mode AssistantMessage does not retain toolCalls, causing issues with tool confirmation workflows #3366

@EkkoWH

Description

🐞 Bug description
When using streaming mode (Flux) to call a model that returns tool calls, the aggregated AssistantMessage constructed by MessageAggregator does not contain the toolCalls property. As a result, tool call information cannot be retrieved from previous assistant messages stored in memory.
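For illustration, a minimal sketch of the asymmetry (assuming a configured ChatClient named client and a user prompt that makes the model emit a tool call; not taken verbatim from the reproduction below):

import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.model.MessageAggregator;
import reactor.core.publisher.Flux;

// Blocking call: the resulting AssistantMessage carries the tool calls.
ChatResponse blocking = client.prompt()
        .user("What's the weather in Beijing?")
        .call()
        .chatResponse();
System.out.println(blocking.getResult().getOutput().hasToolCalls()); // true

// Streaming call: after MessageAggregator folds the chunks into one message,
// the aggregated AssistantMessage has an empty toolCalls list.
Flux<ChatResponse> stream = client.prompt()
        .user("What's the weather in Beijing?")
        .stream()
        .chatResponse();
new MessageAggregator().aggregate(stream, aggregated ->
        System.out.println(aggregated.getResult().getOutput().hasToolCalls())) // false
    .blockLast();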

This behavior becomes particularly problematic when internalToolExecutionEnabled=false, where tool execution is intended to be controlled manually by the user. In such workflows, it's necessary to retrieve tool call information from the last assistant message in memory, but that data is missing due to the above issue.
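For context, the manual-confirmation flow looks roughly like this (a sketch using the beans from the config below; ToolCallingManager.executeToolCalls is the standard Spring AI entry point for user-controlled tool execution):

import java.util.List;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.model.Generation;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.model.tool.ToolExecutionResult;

// After the user confirms, re-read the last assistant message from memory
// and hand its tool calls to the ToolCallingManager.
List<Message> history = chatMemory.get(conversationId);
Message last = history.get(history.size() - 1);
if (last instanceof AssistantMessage assistantMessage && assistantMessage.hasToolCalls()) {
    Prompt prompt = new Prompt(history, chatModel.getDefaultOptions());
    ChatResponse pendingToolCalls = new ChatResponse(List.of(new Generation(assistantMessage)));
    ToolExecutionResult result = toolCallingManager.executeToolCalls(prompt, pendingToolCalls);
    // Continue the conversation with the tool results appended.
    chatModel.call(new Prompt(result.conversationHistory(), chatModel.getDefaultOptions()));
}
// In streaming mode the branch above is never taken, because
// hasToolCalls() is false on the aggregated message.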

Note: This issue is not caused by setting internalToolExecutionEnabled=false. Instead, the issue is exacerbated by it, since downstream components rely on consistent toolCalls data across both streaming and non-streaming modes.

💻 Environment
Spring AI Version: 1.0.0

Java Version: 17

Model: Qwen2.5-72B-Instruct

Usage Mode: Streaming (Flux)

Tool Execution Mode: internalToolExecutionEnabled=false

Vector store: Not involved

🪜 Steps to reproduce
Configure a Spring AI chat client with streaming mode enabled.

Ensure that the response from the model includes tool calls.

Use org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor so the conversation is persisted to ChatMemory.

Inspect the aggregated AssistantMessage – toolCalls is missing.

Attempt to retrieve the tool calls from memory (e.g., ChatMemory#get) – the returned AssistantMessage has an empty toolCalls list.

✅ Expected behavior
The toolCalls carried by the streamed generations should be correctly propagated to the aggregated AssistantMessage, regardless of whether streaming or non-streaming mode is used. This ensures consistent memory behavior and supports downstream workflows such as manual tool execution confirmation.
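Until this is fixed in MessageAggregator, a possible workaround (a sketch, assuming the individual stream chunks still expose the tool calls, which is what this report implies) is to collect them from the raw chunks before aggregation discards them:

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.model.ChatResponse;
import reactor.core.publisher.Flux;

// Capture tool calls from each streamed chunk as a side effect; the per-chunk
// AssistantMessages still carry them even though the aggregate does not.
List<AssistantMessage.ToolCall> toolCalls = new CopyOnWriteArrayList<>();
Flux<ChatResponse> stream = client.prompt()
        .user(userText)
        .stream()
        .chatResponse()
        .doOnNext(chunk -> {
            if (chunk.getResult() != null && chunk.getResult().getOutput() != null
                    && chunk.getResult().getOutput().hasToolCalls()) {
                toolCalls.addAll(chunk.getResult().getOutput().getToolCalls());
            }
        });
// Once the Flux completes, toolCalls holds what the aggregated
// AssistantMessage in memory should have contained.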

🧪 Minimal Complete Reproducible example

My AI Config:
@Bean
public ChatMemoryRepository chatMemoryRepository() {
    return new InMemoryChatMemoryRepository();
}

@Bean
public ChatMemory chatMemory(ChatMemoryRepository chatMemoryRepository) {
    return MessageWindowChatMemory.builder()
            .maxMessages(10)
            .chatMemoryRepository(chatMemoryRepository)
            .build();
}

@Bean
public OpenAiChatModel chatModel(OpenAiApi openAiApi, ToolCallingManager toolCallingManager,
        List<AgentToolsProvider> agentToolsProviders) {
    AgentToolsProvider[] providers = agentToolsProviders.toArray(new AgentToolsProvider[0]);
    ToolCallback[] toolCallbacks = ToolCallbacks.from((Object[]) providers);
    OpenAiChatOptions chatOptions = OpenAiChatOptions.builder()
            .temperature(0.6)
            .model("qwen2.5-72b-instruct")
            .internalToolExecutionEnabled(false)
            .toolCallbacks(toolCallbacks)
            .build();
    return OpenAiChatModel.builder()
            .defaultOptions(chatOptions)
            .toolCallingManager(toolCallingManager)
            .openAiApi(openAiApi)
            .build();
}

@Bean
ChatClient chatClient(OpenAiChatModel chatModel, ChatMemory chatMemory) {
    return ChatClient.builder(chatModel)
            .defaultAdvisors(
                    new SimpleLoggerAdvisor(),
                    MessageChatMemoryAdvisor.builder(chatMemory).build()
            )
            .defaultSystem(systemResource)
            .build();
}

Just chat with AI:

private Flux<ChatResponse> callWithMemory(String conversationId, String userText) {
    Prompt promptWithMemory = new Prompt(chatMemory.get(conversationId), chatModel.getDefaultOptions());
    return client.prompt(promptWithMemory)
            .user(userText)
            .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, conversationId))
            .stream()
            .chatResponse();
}
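Calling the method above and then inspecting memory shows the problem (a sketch; conversationId is any fixed id):

callWithMemory(conversationId, "What's the weather in Beijing?").blockLast();

// The assistant turn that MessageChatMemoryAdvisor persisted has lost its tool calls.
chatMemory.get(conversationId).stream()
        .filter(AssistantMessage.class::isInstance)
        .map(AssistantMessage.class::cast)
        .forEach(m -> System.out.println("hasToolCalls=" + m.hasToolCalls()
                + " toolCalls=" + m.getToolCalls()));
// Streaming: hasToolCalls=false toolCalls=[]   (non-streaming: the tool calls are present)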
