Pull requests: vllm-project/vllm

Pull requests list

Refactor MistralTokenizer
Labels: ci/build, documentation, frontend, multi-modality, tool-calling
#26358 opened Oct 7, 2025 by juliendenize
2 of 5 tasks
[Misc] add usedforsecurity=False in md5 hash call
Labels: new-model
#26357 opened Oct 7, 2025 by dtrifiro
fix json schema alias serializing when streaming
Labels: frontend
#26356 opened Oct 7, 2025 by WoutDeRijck
4 of 6 tasks
Issue 20283 level
Labels: documentation, llama, speculative-decoding, tpu, v1
#26355 opened Oct 7, 2025 by morrison-turnansky
5 tasks
Enable RMSNorm substitution for Transformers backend
#26353 opened Oct 7, 2025 by hmellor
[NIXL] Fix KeyError on abort-after-finished
Labels: kv-connector, ready, v1
#26351 opened Oct 7, 2025 by markmc
[Bugfix] fix TPU model runner cache token_ids_cpu
Labels: tpu, v1
#26348 opened Oct 7, 2025 by yannicks1
Add SwigluOAI implementation for CPUFusedMOE
#26347 opened Oct 7, 2025 by isharif168
5 tasks
[Bugfix]fix Qwen3 xml tool parser
Labels: frontend, qwen, tool-calling
#26345 opened Oct 7, 2025 by Zhikaiiii
[Model] Lfm2Moe
Labels: documentation, new-model, ready
#26344 opened Oct 7, 2025 by paulpak58
2 of 5 tasks
[Misc] Move LRUCache into its own file
Labels: multi-modality, ready
#26342 opened Oct 7, 2025 by DarkLight1337
5 tasks
[V0 Deprecation] Remove VLLM_USE_V1 from tests
Labels: ci/build, kv-connector, multi-modality, ready, speculative-decoding, structured-output, tpu, v1
#26341 opened Oct 7, 2025 by DarkLight1337
5 tasks
[MM][Feat] Add support for audio in video in Qwen2.5-Omni
Labels: documentation, qwen, v1
#26334 opened Oct 7, 2025 by wwl2755 Draft
Revert #24446 and #26168
Labels: v1
#26332 opened Oct 7, 2025 by tdoublep
1 of 5 tasks
[deepseek] add EP8 FusedMOE config for H200 and B200
Labels: deepseek
#26331 opened Oct 7, 2025 by heheda12345
5 tasks
Move online quantization to model.load_weights
Labels: deepseek, llama, qwen
#26327 opened Oct 7, 2025 by jerryzh168 Draft
Bump Flashinfer to v0.4.0rc4
Labels: ci/build, ready, v1
#26326 opened Oct 7, 2025 by elvischenv
5 tasks
[Bugfix] Add missing sink tensor into flash attn cascade attn implementation
Labels: ready, v1
#26325 opened Oct 6, 2025 by plliao
3 of 5 tasks
CLI: Ignore SIGTSTP signal (Ctrl-Z)
Labels: frontend
#26323 opened Oct 6, 2025 by thommahoney-google
2 of 5 tasks
[Bug] Fix Shape Validation for Fallback while Enabling E8M0 for DeepGEMM
Labels: ready
#26322 opened Oct 6, 2025 by yewentao256