forked from vllm-project/vllm
Pull requests: HabanaAI/vllm-fork
Pull requests list
Optimized hpu_graph of bert models to improve the embedding performance.
#2111 opened Nov 1, 2025 by gyou2021
Improve weights loading and fp8 range conversion on Gaudi2
#2108 opened Oct 30, 2025 by yangulei
Multiple model engine and launch script to support qwen3 reranker
#2106 opened Oct 30, 2025 by tinafengfun
compose small seqlen sdpa for qwen2vl and qwen2.5vl
#2102 opened Oct 29, 2025 by yingjie-han
Workaround for Assertion error when embedding with bge-m3 in lazy mode
#2093 opened Oct 28, 2025 by slokesha
add draft version of vllm inference document for v1.22.0
#2082 opened Oct 24, 2025 by heyuanliu-intel
3 tasks
fix bug that VLLM_SKIP_WARMUP=1 is not recognized in vision_bucket
#2036 opened Oct 15, 2025 by yingjie-han