Tags: vllm-project/vllm

v0.14.0rc0

[Bugfix] Fix tool_choice="none" being ignored by GPT-OSS/harmony models (#30867) Signed-off-by: yujiepu <pyjapple@gmail.com> Signed-off-by: PlatinumGod <pyjapple@gmail.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>

v0.13.0

Check for truthy `rope_parameters` not the existence of it (#30983) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> (cherry picked from commit 19c5833)

v0.13.0rc4

[v1] Add PrefixLM support to TritonAttention backend (#30386) (cherry picked from commit 74a1ac3)

v0.13.0rc3

[XPU] fix broken fp8 online quantization for XPU platform (#30831) Signed-off-by: Yan Ma <yan.ma@intel.com> (cherry picked from commit 4f735ba)

v0.13.0rc2

[ROCm] [Bugfix] Fix torch sdpa hallucination (#30789) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> (cherry picked from commit 2410132)

v0.13.0rc1

fake rc release to fix nightly wheels 

v0.12.0

[Core] Rename PassConfig flags as per RFC #27995 (#29646) Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> (cherry picked from commit d7284a2)

v0.11.2

[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (#29036) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> (cherry picked from commit 8f4f77a)

v0.11.1

[BugFix] Fix PP/async scheduling with pooling models (#28899) Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

v0.11.1rc7

[compile] Enable sequence parallelism matching w/o custom ops enabled (#27126) Signed-off-by: angelayi <yiangela7@gmail.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ProExpertProg <lgovedic@redhat.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <luka.govedic@gmail.com> (cherry picked from commit f36292d)