Pull requests: vllm-project/vllm
Pull requests list
Refactor MistralTokenizer ci/build documentation Improvements or additions to documentation frontend multi-modality Related to multi-modality (#4194) tool-calling
#26358 opened Oct 7, 2025 by juliendenize Loading…
2 of 5 tasks
[Misc] add usedforsecurity=False in md5 hash call new-model Requests to new models
#26357 opened Oct 7, 2025 by dtrifiro Loading…
fix json schema alias serializing when streaming frontend
#26356 opened Oct 7, 2025 by WoutDeRijck Loading…
4 of 6 tasks
Issue 20283 level documentation Improvements or additions to documentation llama Related to Llama models speculative-decoding tpu Related to Google TPUs v1
#26355 opened Oct 7, 2025 by morrison-turnansky Loading…
5 tasks
[NIXL] Fix KeyError on abort-after-finished kv-connector ready ONLY add when PR is ready to merge/full CI is needed v1
#26351 opened Oct 7, 2025 by markmc Loading…
[Data-parallel] Allow DP>1 for world_size > num engines per GPU v1
#26350 opened Oct 7, 2025 by patrickvonplaten • Draft
5 tasks
[Bugfix] fix TPU model runner cache token_ids_cpu tpu Related to Google TPUs v1
#26348 opened Oct 7, 2025 by yannicks1 Loading…
Add SwigluOAI implementation for CPUFusedMOE
#26347 opened Oct 7, 2025 by isharif168 Loading…
5 tasks
[Bugfix]fix Qwen3 xml tool parser frontend qwen Related to Qwen models tool-calling
#26345 opened Oct 7, 2025 by Zhikaiiii Loading…
[Model] Lfm2Moe documentation Improvements or additions to documentation new-model Requests to new models ready ONLY add when PR is ready to merge/full CI is needed
#26344 opened Oct 7, 2025 by paulpak58 Loading…
2 of 5 tasks
[Misc] Move LRUCache into its own file multi-modality Related to multi-modality (#4194) ready ONLY add when PR is ready to merge/full CI is needed
#26342 opened Oct 7, 2025 by DarkLight1337 Loading…
5 tasks
[V0 Deprecation] Remove VLLM_USE_V1 from tests ci/build kv-connector multi-modality Related to multi-modality (#4194) ready ONLY add when PR is ready to merge/full CI is needed speculative-decoding structured-output tpu Related to Google TPUs v1
#26341 opened Oct 7, 2025 by DarkLight1337 Loading…
5 tasks
[Feature] Add support for naver/splade-v3 (BERT-based sparse embedding model) new-model Requests to new models
#26339 opened Oct 7, 2025 by gjgjos Loading…
Creating Fp8ParallelLMHeadMethod for fp8 lm_head with customized efficient scaled fp8 kernel speculative-decoding v1
#26337 opened Oct 7, 2025 by yugong333 Loading…
5 tasks
[MM][Feat] Add support for audio in video in Qwen2.5-Omni documentation Improvements or additions to documentation qwen Related to Qwen models v1
[deepseek] add EP8 FusedMOE config for H200 and B200 deepseek Related to DeepSeek models
#26331 opened Oct 7, 2025 by heheda12345 Loading…
5 tasks
[Chore] fix spelling in kv_transfer_config variable name kv-connector v1
#26330 opened Oct 7, 2025 by natoscott Loading…
Move online quantization to model.load_weights deepseek Related to DeepSeek models llama Related to Llama models qwen Related to Qwen models
#26327 opened Oct 7, 2025 by jerryzh168 • Draft
Bump Flashinfer to v0.4.0rc4 ci/build ready ONLY add when PR is ready to merge/full CI is needed v1
#26326 opened Oct 7, 2025 by elvischenv Loading…
5 tasks
[Bugfix] Add missing sink tensor into flash attn cascade attn implementation ready ONLY add when PR is ready to merge/full CI is needed v1
#26325 opened Oct 6, 2025 by plliao Loading…
3 of 5 tasks
CLI: Ignore SIGTSTP signal (Ctrl-Z) frontend
#26323 opened Oct 6, 2025 by thommahoney-google Loading…
2 of 5 tasks
[Bug] Fix Shape Validation for Fallback while Enabling E8M0 for DeepGEMM ready ONLY add when PR is ready to merge/full CI is needed
#26322 opened Oct 6, 2025 by yewentao256 Loading…