Conversation


@WoutDeRijck commented Oct 7, 2025

Purpose

Fix a serialization bug in streaming responses where Pydantic field aliases (e.g. the schema_ field's alias schema) were not preserved during .model_dump() calls.

This caused the "schema_" key to appear instead of "schema" in streamed response events for JSON schema output formats, breaking compatibility with the OpenAI SDK’s ResponseFormatTextJSONSchemaConfig parsing.

Related issue: vllm-project/vllm#26288

Root Cause

  • ResponsesResponse.from_request(...).model_dump() was called without by_alias=True at:
    • vllm/entrypoints/openai/serving_responses.py:1830
    • vllm/entrypoints/openai/serving_responses.py:1879
  • Without by_alias=True, Pydantic outputs internal field names (e.g. schema_) instead of their aliases (schema), causing validation errors downstream; see the sketch after this list.
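
For illustration, a minimal sketch of the Pydantic behavior at fault, using a hypothetical model rather than vLLM's actual class:

    from pydantic import BaseModel, ConfigDict, Field

    class JSONSchemaConfig(BaseModel):
        # "schema" shadows a BaseModel attribute, so the field is named
        # schema_ internally and exposed through the alias "schema".
        model_config = ConfigDict(populate_by_name=True)
        schema_: dict = Field(alias="schema")

    cfg = JSONSchemaConfig(schema={"type": "object"})
    print(cfg.model_dump())               # {'schema_': {'type': 'object'}}  <- the bug
    print(cfg.model_dump(by_alias=True))  # {'schema': {'type': 'object'}}   <- expected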

Fix

Add by_alias=True to both .model_dump() calls so serialized responses use the correct alias names consistent with OpenAI schema expectations.

    # Before
    initial_response = ResponsesResponse.from_request(...).model_dump()

    # After
    initial_response = ResponsesResponse.from_request(...).model_dump(by_alias=True)

and

response=final_response.model_dump(by_alias=True)

Test Plan

  1. Setup

    • vllm==0.11.0
    • openai==1.108.0
  2. Reproduce the Bug (before fix)

    stream = await client.responses.create(
        model=model,
        input=formatted_prompt,
        text={
            "format": {
                "name": "schema_ner",
                "schema": json_schema,
                "type": "json_schema",
                "strict": True,
            }
        },
        stream=True,
    )

    Observe that the first streamed event includes "schema_" instead of "schema" (a full repro sketch follows this test plan).

  3. Apply the Fix

    • Add by_alias=True in both .model_dump() calls.
    • Rebuild and rerun the same request.
  4. Expected Behavior

    • Streamed events now correctly include "schema" key.
    • No validation error occurs when parsing through OpenAI SDK or FastAPI’s Pydantic model.
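
Full repro sketch referenced in step 2 (the endpoint, model name, and json_schema below are placeholders; assumes a vLLM OpenAI-compatible server on localhost:8000):

    import asyncio
    from openai import AsyncOpenAI

    async def main():
        client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
        json_schema = {
            "type": "object",
            "properties": {"entities": {"type": "array", "items": {"type": "string"}}},
            "required": ["entities"],
        }
        stream = await client.responses.create(
            model="openai/gpt-oss-20b",  # placeholder model name
            input="Extract entities: vLLM streams structured output.",
            text={"format": {"name": "schema_ner", "schema": json_schema,
                             "type": "json_schema", "strict": True}},
            stream=True,
        )
        async for event in stream:
            # Before the fix, this first event carries "schema_" instead of "schema".
            print(event)
            break

    asyncio.run(main())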

Test Result

Before fix

  • Streaming response JSON contained "schema_"
  • Validation failed with missing "schema" field

After fix

  • Streaming response JSON correctly uses "schema"
  • Validation passes
  • Structured outputs parse successfully in both streaming and non-streaming modes

Example (after fix):

{ "text": { "format": { "name": "schema_ner", "schema": { ... }, "type": "json_schema", "strict": true } } }

The fix is effective; without it, the following errors persist:

    (APIServer pid=7) | response.text.format.ResponseFormatTextJSONSchemaConfig.schema
    (APIServer pid=7) |   Field required [type=missing, input_value={'name': 'schema_ner', 's...': None, 'strict': True}, input_type=dict]


Essential Elements of an Effective PR Description Checklist
  • Purpose of the PR
  • Test plan provided
  • Test results before and after
  • Links to related issue(s)
  • (Optional) Documentation update — not required
  • (Optional) Release notes update — internal behavioral fix only

BEFORE SUBMITTING: see vLLM contributing guide

Signed-off-by: WoutDeRijck <derijck.2001@icloud.com>
Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request aims to fix a serialization bug in streaming responses where Pydantic field aliases were not being used. The provided change correctly addresses this for the response.created event by adding by_alias=True to the model_dump() call. However, the fix is incomplete. A similar issue persists for the response.completed event, as the final_response object is not serialized with the correct alias settings before being sent. I've left a critical comment detailing the necessary change to fully resolve the bug.

@WoutDeRijck
Author

Labels: structured output, streaming

Contributor

@qandrew left a comment


hi @WoutDeRijck , thanks for looking into this! Could you add a unit test in https://github.com/vllm-project/vllm/blob/main/tests/entrypoints/openai/test_response_api_with_harmony.py so we can prevent this behavior in the future?

Signed-off-by: WoutDeRijck <derijck.2001@icloud.com>
@WoutDeRijck
Author

Hi @qandrew, I've added the unit test!
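
For reference, a sketch of the kind of assertion such a test might make (the actual test in test_response_api_with_harmony.py may differ):

    def assert_schema_alias_preserved(event_payload: dict) -> None:
        # Streamed payloads should expose the alias "schema",
        # never the internal field name "schema_".
        fmt = event_payload["response"]["text"]["format"]
        assert "schema" in fmt
        assert "schema_" not in fmt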

type="response.completed",
sequence_number=-1,
response=final_response,
response=final_response.model_dump(by_alias=True),

@WoutDeRijck could you verify that this doesn't break serialization? I just added a PR to revert the model_dump :P not sure if by_alias will cause a different behavior? #26185
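
One way to sanity-check that by_alias=True doesn't break serialization is a round-trip (a hypothetical check, not part of this PR):

    # Pydantic accepts aliases on input by default, so re-validating the
    # alias-keyed dump should reproduce the original model.
    dumped = final_response.model_dump(by_alias=True)
    assert ResponsesResponse.model_validate(dumped) == final_response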


Labels

frontend, gpt-oss (Related to GPT-OSS models)
