
@HaloWorld (Contributor) commented Dec 17, 2025

GPT-OSS models using the harmony format were ignoring the tool_choice="none" parameter and could still trigger tool calls when tools were provided in the request. This issue arose because the _make_request_with_harmony method only checked for the existence of request.tools, without accounting for the tool_choice setting or the exclude_tools_when_tool_choice_none flag.
This fix ensures that harmony models respect the exclude_tools_when_tool_choice_none flag, aligning their behavior with other model types and OpenAI API standards. When tool_choice="none" and the flag is enabled, tool definitions are no longer included in the system message.

Purpose

Modified _make_request_with_harmony to incorporate checks for tool_choice and exclude_tools_when_tool_choice_none.

Test Plan

  • Start the vLLM server with `--exclude-tools-when-tool-choice-none`:

```shell
vllm serve /aifs4su/yujiepu/models/openai/gpt-oss-20b/ \
  --served-model-name gpt-oss-20b \
  --exclude-tools-when-tool-choice-none \
  --max-model-len 8192 \
  --tool-call-parser openai \
  --enable-auto-tool-choice \
  --trust-remote-code \
  --max-cudagraph-capture-size 4
```
  • Send a tool-calling request with tool_choice="none":

```shell
curl -s -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-20b",
    "messages": [
      {"role": "user", "content": "What is the weather in Dallas, TX?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string"},
              "state": {"type": "string"},
              "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city", "state", "unit"]
          }
        }
      }
    ],
    "tool_choice": "none"
  }'
```

Test Result

  • Before: (screenshot)
  • After: (screenshot)

@gemini-code-assist (bot) left a comment


Code Review

This pull request addresses a bug where tool_choice="none" was being ignored for GPT-OSS/harmony models. The changes correctly modify _make_request_with_harmony to respect the exclude_tools_when_tool_choice_none flag, preventing tool definitions from being included in the prompt when tool_choice is set to "none". The logic is sound, and a new test case has been added to verify the fix. The changes appear correct and improve consistency. I have no major concerns.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@chaunceyjiang (Collaborator) left a comment


The Responses API should also support this.

@chaunceyjiang chaunceyjiang self-assigned this Dec 18, 2025

mergify bot commented Dec 18, 2025

Hi @HaloWorld, the pre-commit checks have failed. Please run:

```shell
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
```

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
```shell
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint
```
@HaloWorld (Contributor, Author)

> The Responses API should also support this.

Thank you for the feedback. You're absolutely right that the feature should ideally be consistent across our APIs.

Currently, the OpenAIServingResponses class does not have direct support for the exclude_tools_when_tool_choice_none parameter (unlike OpenAIServingChat). Furthermore, the existing Responses API implementation for Harmony currently only supports tool_choice='auto', as seen in serving_responses.py (L576-L584).

Given this context and to adhere to the principle of minimal changes for this PR, would it be acceptable to add support for the Responses API in a separate, follow-up PR? This would allow us to land the core logic for the Chat API promptly while giving due attention to properly extending the Responses API.
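For concreteness, the Responses-side restriction mentioned above amounts to a guard of roughly this shape. This is a hypothetical sketch with a made-up function name, not the actual serving_responses.py code:

```python
def validate_harmony_responses_tool_choice(tool_choice: str) -> None:
    # The harmony Responses path currently accepts only "auto".
    # Supporting tool_choice="none" (plus the exclude flag) would mean
    # relaxing this check and filtering tools before building the prompt.
    if tool_choice != "auto":
        raise ValueError(
            f"tool_choice={tool_choice!r} is not supported for harmony "
            "models via the Responses API"
        )
```

A follow-up PR extending the Responses API would replace this hard rejection with the same exclude-tools logic the Chat path now uses.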

Please let me know your thoughts. I'm happy to proceed either way.

@chaunceyjiang (Collaborator) left a comment


Thanks~

@github-project-automation github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Dec 18, 2025
@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 18, 2025
@HaloWorld (Contributor, Author)

@chaunceyjiang Hi, thank you for reviewing and approving the changes. I've noticed that the CI run has failed on one specific test case:

FAILED entrypoints/openai/test_response_api_with_harmony.py::test_code_interpreter[openai/gpt-oss-20b] - AssertionError: assert '5846' in '9648' 
(screenshot)

Interestingly, I wasn't able to reproduce this failure locally when running the same test, and it passes on my machine.

(screenshot)

I'd be happy to provide any additional logs or information that might help investigate the discrepancy between the CI environment and my local setup.

HaloWorld and others added 2 commits December 18, 2025 20:47
Signed-off-by: yujiepu <pyjapple@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: PlatinumGod <pyjapple@gmail.com>
Signed-off-by: yujiepu <pyjapple@gmail.com>
Signed-off-by: yujiepu <pyjapple@gmail.com>
@chaunceyjiang chaunceyjiang merged commit 6a09612 into vllm-project:main Dec 19, 2025
47 checks passed
zRzRzRzRzRzRzR pushed a commit to zRzRzRzRzRzRzR/vllm that referenced this pull request Dec 19, 2025
…ls (vllm-project#30867) Signed-off-by: yujiepu <pyjapple@gmail.com> Signed-off-by: PlatinumGod <pyjapple@gmail.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>

Labels

  • frontend
  • gpt-oss — Related to GPT-OSS models
  • ready — ONLY add when PR is ready to merge/full CI is needed

2 participants