Commit d9ab1ad

reasoning_content -> reasoning (vllm-project#27752)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
1 parent 608bb14 commit d9ab1ad

46 files changed, +428 -438 lines changed


docs/features/reasoning_outputs.md

Lines changed: 24 additions & 21 deletions
@@ -2,7 +2,10 @@
 
 vLLM offers support for reasoning models like [DeepSeek R1](https://huggingface.co/deepseek-ai/DeepSeek-R1), which are designed to generate outputs containing both reasoning steps and final conclusions.
 
-Reasoning models return an additional `reasoning_content` field in their outputs, which contains the reasoning steps that led to the final conclusion. This field is not present in the outputs of other models.
+Reasoning models return an additional `reasoning` field in their outputs, which contains the reasoning steps that led to the final conclusion. This field is not present in the outputs of other models.
+
+!!! warning
+    `reasoning` used to be called `reasoning_content`. For now, `reasoning_content` will continue to work. However, we encourage you to migrate to `reasoning` in case `reasoning_content` is removed in future.
 
 ## Supported Models
 
@@ -61,18 +64,18 @@ Next, make a request to the model that should return the reasoning content in th
 # extra_body={"chat_template_kwargs": {"enable_thinking": False}}
 response = client.chat.completions.create(model=model, messages=messages)
 
-reasoning_content = response.choices[0].message.reasoning_content
+reasoning = response.choices[0].message.reasoning
 content = response.choices[0].message.content
 
-print("reasoning_content:", reasoning_content)
+print("reasoning:", reasoning)
 print("content:", content)
 ```
 
-The `reasoning_content` field contains the reasoning steps that led to the final conclusion, while the `content` field contains the final conclusion.
+The `reasoning` field contains the reasoning steps that led to the final conclusion, while the `content` field contains the final conclusion.
 
 ## Streaming chat completions
 
-Streaming chat completions are also supported for reasoning models. The `reasoning_content` field is available in the `delta` field in [chat completion response chunks](https://platform.openai.com/docs/api-reference/chat/streaming).
+Streaming chat completions are also supported for reasoning models. The `reasoning` field is available in the `delta` field in [chat completion response chunks](https://platform.openai.com/docs/api-reference/chat/streaming).
 
 ??? console "Json"
 
@@ -88,7 +91,7 @@ Streaming chat completions are also supported for reasoning models. The `reasoni
         "index": 0,
         "delta": {
             "role": "assistant",
-            "reasoning_content": "is",
+            "reasoning": "is",
         },
         "logprobs": null,
         "finish_reason": null
@@ -97,7 +100,7 @@ Streaming chat completions are also supported for reasoning models. The `reasoni
     }
 ```
 
-OpenAI Python client library does not officially support `reasoning_content` attribute for streaming output. But the client supports extra attributes in the response. You can use `hasattr` to check if the `reasoning_content` attribute is present in the response. For example:
+OpenAI Python client library does not officially support `reasoning` attribute for streaming output. But the client supports extra attributes in the response. You can use `hasattr` to check if the `reasoning` attribute is present in the response. For example:
 
 ??? code
 
@@ -127,22 +130,22 @@ OpenAI Python client library does not officially support `reasoning_content` att
 )
 
 print("client: Start streaming chat completions...")
-printed_reasoning_content = False
+printed_reasoning = False
 printed_content = False
 
 for chunk in stream:
-    # Safely extract reasoning_content and content from delta,
+    # Safely extract reasoning and content from delta,
     # defaulting to None if attributes don't exist or are empty strings
-    reasoning_content = (
-        getattr(chunk.choices[0].delta, "reasoning_content", None) or None
+    reasoning = (
+        getattr(chunk.choices[0].delta, "reasoning", None) or None
     )
     content = getattr(chunk.choices[0].delta, "content", None) or None
 
-    if reasoning_content is not None:
-        if not printed_reasoning_content:
-            printed_reasoning_content = True
-            print("reasoning_content:", end="", flush=True)
-        print(reasoning_content, end="", flush=True)
+    if reasoning is not None:
+        if not printed_reasoning:
+            printed_reasoning = True
+            print("reasoning:", end="", flush=True)
+        print(reasoning, end="", flush=True)
     elif content is not None:
         if not printed_content:
             printed_content = True
@@ -151,11 +154,11 @@ OpenAI Python client library does not officially support `reasoning_content` att
         print(content, end="", flush=True)
 ```
 
-Remember to check whether the `reasoning_content` exists in the response before accessing it. You could check out the [example](https://github.com/vllm-project/vllm/blob/main/examples/online_serving/openai_chat_completion_with_reasoning_streaming.py).
+Remember to check whether the `reasoning` exists in the response before accessing it. You could check out the [example](https://github.com/vllm-project/vllm/blob/main/examples/online_serving/openai_chat_completion_with_reasoning_streaming.py).
 
 ## Tool Calling
 
-The reasoning content is also available when both tool calling and the reasoning parser are enabled. Additionally, tool calling only parses functions from the `content` field, not from the `reasoning_content`.
+The reasoning content is also available when both tool calling and the reasoning parser are enabled. Additionally, tool calling only parses functions from the `content` field, not from the `reasoning`.
 
 ??? code
 
@@ -192,7 +195,7 @@ The reasoning content is also available when both tool calling and the reasoning
 print(response)
 tool_call = response.choices[0].message.tool_calls[0].function
 
-print(f"reasoning_content: {response.choices[0].message.reasoning_content}")
+print(f"reasoning: {response.choices[0].message.reasoning}")
 print(f"Function called: {tool_call.name}")
 print(f"Arguments: {tool_call.arguments}")
 ```
@@ -223,7 +226,7 @@ You can add a new `ReasoningParser` similar to [vllm/reasoning/deepseek_r1_reaso
     def __init__(self, tokenizer: AnyTokenizer):
         super().__init__(tokenizer)
 
-    def extract_reasoning_content_streaming(
+    def extract_reasoning_streaming(
        self,
         previous_text: str,
         current_text: str,
@@ -240,7 +243,7 @@ You can add a new `ReasoningParser` similar to [vllm/reasoning/deepseek_r1_reaso
         previously been parsed and extracted (see constructor)
         """
 
-    def extract_reasoning_content(
+    def extract_reasoning(
         self,
         model_output: str,
         request: ChatCompletionRequest | ResponsesRequest,
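
Since the server keeps accepting the legacy field for now (see the warning added above), client code that must run against both pre- and post-rename vLLM versions can probe both attributes. A minimal sketch, not part of this commit; the helper name `get_reasoning` is our own:

```python
def get_reasoning(message):
    """Return the reasoning text from a chat message object, preferring
    the new `reasoning` field and falling back to the legacy
    `reasoning_content` field; returns None if neither is set."""
    return getattr(message, "reasoning", None) or getattr(
        message, "reasoning_content", None
    )
```

Called as `get_reasoning(response.choices[0].message)`, the same helper works for both non-streaming messages and streaming deltas.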

docs/features/structured_outputs.md

Lines changed: 1 addition & 1 deletion
@@ -204,7 +204,7 @@ Note that you can use reasoning with any provided structured outputs feature. Th
         }
     },
 )
-print("reasoning_content: ", completion.choices[0].message.reasoning_content)
+print("reasoning: ", completion.choices[0].message.reasoning)
 print("content: ", completion.choices[0].message.content)
 ```

examples/online_serving/openai_chat_completion_tool_calls_with_reasoning.py

Lines changed: 11 additions & 11 deletions
@@ -2,7 +2,7 @@
 # SPDX-FileCopyrightText: Copyright contributors to the vLLM project
 """
 An example demonstrates how to use tool calling with reasoning models
-like QwQ-32B. The reasoning_content will not be parsed by the tool
+like QwQ-32B. The reasoning will not be parsed by the tool
 calling process; only the final output will be parsed.
 
 To run this example, you need to start the vLLM server with both
@@ -78,7 +78,7 @@ def get_current_weather(city: str, state: str, unit: "str"):
 
 
 def extract_reasoning_and_calls(chunks: list):
-    reasoning_content = ""
+    reasoning = ""
     tool_call_idx = -1
     arguments = []
     function_names = []
@@ -97,9 +97,9 @@ def extract_reasoning_and_calls(chunks: list):
             if tool_call.function.arguments:
                 arguments[tool_call_idx] += tool_call.function.arguments
         else:
-            if hasattr(chunk.choices[0].delta, "reasoning_content"):
-                reasoning_content += chunk.choices[0].delta.reasoning_content
-    return reasoning_content, arguments, function_names
+            if hasattr(chunk.choices[0].delta, "reasoning"):
+                reasoning += chunk.choices[0].delta.reasoning
+    return reasoning, arguments, function_names
 
 
 def main():
@@ -115,7 +115,7 @@ def main():
     tool_calls = client.chat.completions.create(
         messages=messages, model=model, tools=tools
     )
-    print(f"reasoning_content: {tool_calls.choices[0].message.reasoning_content}")
+    print(f"reasoning: {tool_calls.choices[0].message.reasoning}")
     print(f"function name: {tool_calls.choices[0].message.tool_calls[0].function.name}")
     print(
         f"function arguments: "
@@ -129,9 +129,9 @@ def main():
 
     chunks = list(tool_calls_stream)
 
-    reasoning_content, arguments, function_names = extract_reasoning_and_calls(chunks)
+    reasoning, arguments, function_names = extract_reasoning_and_calls(chunks)
 
-    print(f"reasoning_content: {reasoning_content}")
+    print(f"reasoning: {reasoning}")
     print(f"function name: {function_names[0]}")
     print(f"function arguments: {arguments[0]}")
 
@@ -144,7 +144,7 @@ def main():
     )
 
     tool_call = tool_calls.choices[0].message.tool_calls[0].function
-    print(f"reasoning_content: {tool_calls.choices[0].message.reasoning_content}")
+    print(f"reasoning: {tool_calls.choices[0].message.reasoning}")
     print(f"function name: {tool_call.name}")
     print(f"function arguments: {tool_call.arguments}")
     print("----------Stream Generate With Named Function Calling--------------")
@@ -159,8 +159,8 @@ def main():
 
     chunks = list(tool_calls_stream)
 
-    reasoning_content, arguments, function_names = extract_reasoning_and_calls(chunks)
-    print(f"reasoning_content: {reasoning_content}")
+    reasoning, arguments, function_names = extract_reasoning_and_calls(chunks)
+    print(f"reasoning: {reasoning}")
     print(f"function name: {function_names[0]}")
     print(f"function arguments: {arguments[0]}")
     print("\n\n")

examples/online_serving/openai_chat_completion_with_reasoning.py

Lines changed: 4 additions & 4 deletions
@@ -38,10 +38,10 @@ def main():
     # For granite, add: `extra_body={"chat_template_kwargs": {"thinking": True}}`
     response = client.chat.completions.create(model=model, messages=messages)
 
-    reasoning_content = response.choices[0].message.reasoning_content
+    reasoning = response.choices[0].message.reasoning
     content = response.choices[0].message.content
 
-    print("reasoning_content for Round 1:", reasoning_content)
+    print("reasoning for Round 1:", reasoning)
     print("content for Round 1:", content)
 
     # Round 2
@@ -54,10 +54,10 @@ def main():
     )
     response = client.chat.completions.create(model=model, messages=messages)
 
-    reasoning_content = response.choices[0].message.reasoning_content
+    reasoning = response.choices[0].message.reasoning
     content = response.choices[0].message.content
 
-    print("reasoning_content for Round 2:", reasoning_content)
+    print("reasoning for Round 2:", reasoning)
     print("content for Round 2:", content)
 
 
examples/online_serving/openai_chat_completion_with_reasoning_streaming.py

Lines changed: 9 additions & 11 deletions
@@ -20,7 +20,7 @@
 where you want to display chat completions to the user as they are generated
 by the model.
 
-Remember to check content and reasoning_content exist in `ChatCompletionChunk`,
+Remember to check content and reasoning exist in `ChatCompletionChunk`,
 content may not exist leading to errors if you try to access it.
 """
 
@@ -47,22 +47,20 @@ def main():
     stream = client.chat.completions.create(model=model, messages=messages, stream=True)
 
     print("client: Start streaming chat completions...")
-    printed_reasoning_content = False
+    printed_reasoning = False
     printed_content = False
 
     for chunk in stream:
-        # Safely extract reasoning_content and content from delta,
+        # Safely extract reasoning and content from delta,
         # defaulting to None if attributes don't exist or are empty strings
-        reasoning_content = (
-            getattr(chunk.choices[0].delta, "reasoning_content", None) or None
-        )
+        reasoning = getattr(chunk.choices[0].delta, "reasoning", None) or None
         content = getattr(chunk.choices[0].delta, "content", None) or None
 
-        if reasoning_content is not None:
-            if not printed_reasoning_content:
-                printed_reasoning_content = True
-                print("reasoning_content:", end="", flush=True)
-            print(reasoning_content, end="", flush=True)
+        if reasoning is not None:
+            if not printed_reasoning:
+                printed_reasoning = True
+                print("reasoning:", end="", flush=True)
+            print(reasoning, end="", flush=True)
         elif content is not None:
             if not printed_content:
                 printed_content = True

examples/online_serving/streamlit_openai_chatbot_webserver.py

Lines changed: 4 additions & 4 deletions
@@ -159,8 +159,8 @@ def get_llm_response(messages, model, reason, content_ph=None, reasoning_ph=None
     for chunk in response:
         delta = chunk.choices[0].delta
         # Stream reasoning first
-        if reason and hasattr(delta, "reasoning_content") and live_think:
-            rc = delta.reasoning_content
+        if reason and hasattr(delta, "reasoning") and live_think:
+            rc = delta.reasoning
             if rc:
                 think_text += rc
                 live_think.markdown(think_text + "▌")
@@ -262,8 +262,8 @@ def server_supports_reasoning():
         messages=[{"role": "user", "content": "Hi"}],
         stream=False,
     )
-    return hasattr(resp.choices[0].message, "reasoning_content") and bool(
-        resp.choices[0].message.reasoning_content
+    return hasattr(resp.choices[0].message, "reasoning") and bool(
+        resp.choices[0].message.reasoning
     )
 
 
examples/online_serving/structured_outputs/structured_outputs.py

Lines changed: 3 additions & 3 deletions
@@ -33,7 +33,7 @@ async def print_stream_response(
     async for chunk in stream_response:
         delta = chunk.choices[0].delta
 
-        reasoning_chunk_text: str | None = getattr(delta, "reasoning_content", None)
+        reasoning_chunk_text: str | None = getattr(delta, "reasoning", None)
         content_chunk_text = delta.content
 
         if args.reasoning:
@@ -255,8 +255,8 @@ async def cli():
     for constraint, response in zip(constraints, results):
         print(f"\n\n{constraint}:")
         message = response.choices[0].message
-        if args.reasoning and hasattr(message, "reasoning_content"):
-            print(f"  Reasoning: {message.reasoning_content or ''}")
+        if args.reasoning and hasattr(message, "reasoning"):
+            print(f"  Reasoning: {message.reasoning or ''}")
         print(f"  Content: {message.content!r}")
 
 
tests/entrypoints/openai/test_chat_with_tool_reasoning.py

Lines changed: 7 additions & 7 deletions
@@ -80,7 +80,7 @@ async def client(server):
 
 
 def extract_reasoning_and_calls(chunks: list):
-    reasoning_content = ""
+    reasoning = ""
     tool_call_idx = -1
     arguments = []
     function_names = []
@@ -99,9 +99,9 @@ def extract_reasoning_and_calls(chunks: list):
             if tool_call.function.arguments:
                 arguments[tool_call_idx] += tool_call.function.arguments
         else:
-            if hasattr(chunk.choices[0].delta, "reasoning_content"):
-                reasoning_content += chunk.choices[0].delta.reasoning_content
-    return reasoning_content, arguments, function_names
+            if hasattr(chunk.choices[0].delta, "reasoning"):
+                reasoning += chunk.choices[0].delta.reasoning
+    return reasoning, arguments, function_names
 
 
 # test streaming
@@ -119,8 +119,8 @@ async def test_chat_streaming_of_tool_and_reasoning(client: openai.AsyncOpenAI):
     async for chunk in stream:
         chunks.append(chunk)
 
-    reasoning_content, arguments, function_names = extract_reasoning_and_calls(chunks)
-    assert len(reasoning_content) > 0
+    reasoning, arguments, function_names = extract_reasoning_and_calls(chunks)
+    assert len(reasoning) > 0
     assert len(function_names) > 0 and function_names[0] == FUNC_NAME
     assert len(arguments) > 0 and arguments[0] == FUNC_ARGS
 
@@ -136,6 +136,6 @@ async def test_chat_full_of_tool_and_reasoning(client: openai.AsyncOpenAI):
         stream=False,
     )
 
-    assert len(tool_calls.choices[0].message.reasoning_content) > 0
+    assert len(tool_calls.choices[0].message.reasoning) > 0
     assert tool_calls.choices[0].message.tool_calls[0].function.name == FUNC_NAME
     assert tool_calls.choices[0].message.tool_calls[0].function.arguments == FUNC_ARGS

tests/entrypoints/openai/test_completion_with_function_calling.py

Lines changed: 4 additions & 4 deletions
@@ -180,8 +180,8 @@ async def test_function_tool_use(
         extra_body={"chat_template_kwargs": {"enable_thinking": enable_thinking}},
     )
     if enable_thinking:
-        assert chat_completion.choices[0].message.reasoning_content is not None
-        assert chat_completion.choices[0].message.reasoning_content != ""
+        assert chat_completion.choices[0].message.reasoning is not None
+        assert chat_completion.choices[0].message.reasoning != ""
         assert chat_completion.choices[0].message.tool_calls is not None
         assert len(chat_completion.choices[0].message.tool_calls) > 0
     else:
@@ -200,9 +200,9 @@ async def test_function_tool_use(
         async for chunk in output_stream:
             if chunk.choices:
                 if enable_thinking and getattr(
-                    chunk.choices[0].delta, "reasoning_content", None
+                    chunk.choices[0].delta, "reasoning", None
                 ):
-                    reasoning.append(chunk.choices[0].delta.reasoning_content)
+                    reasoning.append(chunk.choices[0].delta.reasoning)
                 if chunk.choices[0].delta.tool_calls:
                     output.extend(chunk.choices[0].delta.tool_calls)
 
tests/entrypoints/openai/test_run_batch.py

Lines changed: 5 additions & 5 deletions
@@ -232,9 +232,9 @@ def test_reasoning_parser():
         assert isinstance(line_dict, dict)
         assert line_dict["error"] is None
 
-        # Check that reasoning_content is present and not empty
-        reasoning_content = line_dict["response"]["body"]["choices"][0]["message"][
-            "reasoning_content"
+        # Check that reasoning is present and not empty
+        reasoning = line_dict["response"]["body"]["choices"][0]["message"][
+            "reasoning"
         ]
-        assert reasoning_content is not None
-        assert len(reasoning_content) > 0
+        assert reasoning is not None
+        assert len(reasoning) > 0
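
The same dual-field probe applies per delta when streaming. A hedged sketch mirroring the `getattr` pattern used throughout this commit; the base URL, API key, and model name are placeholders, not part of the commit:

```python
from openai import OpenAI

# Placeholder endpoint and model; assumes a vLLM server started with a
# reasoning parser enabled, as described in docs/features/reasoning_outputs.md.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    # Probe the new field first, then the legacy one, so the same loop
    # works against servers from before and after the rename.
    reasoning = getattr(delta, "reasoning", None) or getattr(
        delta, "reasoning_content", None
    )
    if reasoning:
        print(reasoning, end="", flush=True)
```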
