
Commit 9f1c642

Authored by DoubleVII, 杨森, and gemini-code-assist[bot]
[Bugfix] fix Qwen2.5-Omni processor output mapping (vllm-project#23058)
Signed-off-by: double7 <33449816+DoubleVII@users.noreply.github.com>
Co-authored-by: 杨森 <yangsen.double7@bytedance.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1 parent: 7be3a59

File tree

1 file changed: +5 −0 lines changed

vllm/model_executor/models/qwen2_5_omni_thinker.py

Lines changed: 5 additions & 0 deletions
@@ -88,6 +88,11 @@ def _qwen2_5_omni_thinker_field_config(hf_inputs: Mapping[str, torch.Tensor]):
     video_grid_thw = hf_inputs.get("video_grid_thw", torch.empty((0, 3)))
     video_grid_sizes = video_grid_thw.prod(-1)
 
+    # vllm use `second_per_grid_ts` to compute multimodal rotary embedding
+    video_second_per_grid = hf_inputs.get("video_second_per_grid", None)
+    if video_second_per_grid is not None:
+        hf_inputs["second_per_grid_ts"] = video_second_per_grid
+
     return dict(
         input_audio_features=MultiModalFieldConfig.flat_from_sizes(
             "audio", audio_feature_lengths, dim=1),
