HabanaAI / vllm-fork Public

forked from vllm-project/vllm

Fork 134
Star 84

Pull requests: HabanaAI/vllm-fork

Labels 19 Milestones 0

73 Open 1,935 Closed

Pull requests list

update mineru 1.5.4 doc

#2113 opened Nov 3, 2025 by yingjie-han

add mineru doc

#2112 opened Nov 3, 2025 by yingjie-han

Optimized hpu_graph of bert models to improve the embedding performance.

#2111 opened Nov 1, 2025 by gyou2021

Fix ray init params

#2110 opened Oct 30, 2025 by michalkuligowski

Improve weights loading and fp8 range conversion on Gaudi2

#2108 opened Oct 30, 2025 by yangulei

Multiple model engine and launch script to support qwen3 reranker

#2106 opened Oct 30, 2025 by tinafengfun

compose small seqlen sdpa for qwen2vl and qwen2.5vl

#2102 opened Oct 29, 2025 by yingjie-han

Fix no attr enable_server_load_tracking error

#2097 opened Oct 28, 2025 by shepark

Add max_pixels option.

#2094 opened Oct 28, 2025 by wenbinc-Bin

Workaround for Assertion error when embedding with bge-m3 in lazy mode

#2093 opened Oct 28, 2025 by slokesha

Move only the quantized model and tensors to HPU

#2091 opened Oct 27, 2025 by yangulei

add draft version of vllm inference document for v1.22.0

#2082 opened Oct 24, 2025 by heyuanliu-intel

3 tasks

Add dotsocr

#2077 opened Oct 23, 2025 by tianyuan211

fix wrong section for Qwen series doc

#2074 opened Oct 23, 2025 by heyuanliu-intel

3 tasks

Enable chunked prefill on aice 1.22

#2070 opened Oct 23, 2025 by YuJiankang

refactor(hpu_model_runner): restructure multimodal-related code

#2066 opened Oct 22, 2025 by Jing1Ling • Draft

3 tasks

Slokesha port ovis

#2063 opened Oct 21, 2025 by slokesha • Draft

3 tasks

[CS-1549] Enable function call DeepSeek-V3.1

#2047 opened Oct 19, 2025 by JianyuLi01

Porting_ovis

#2044 opened Oct 16, 2025 by SupreetSinghPalne • Draft

3 tasks

Spalne/porting ovis

#2038 opened Oct 16, 2025 by SupreetSinghPalne • Draft

3 tasks

replaced apply_rotary_emb_torch() with rotary_embedding imp

#2037 opened Oct 15, 2025 by slokesha • Draft

3 tasks

fix bug that VLLM_SKIP_WARMUP=1 is not recognized in vision_bucket

#2036 opened Oct 15, 2025 by yingjie-han

Fix cache miss for Ovis2.5

#2035 opened Oct 15, 2025 by Jianhong-Zhang • Draft

Fix cache miss for InternVL

#2034 opened Oct 15, 2025 by Jianhong-Zhang • Draft

Keep grids tensor on CPU in multimodal kwargs

#2019 opened Oct 10, 2025 by slokesha • Draft

3 tasks
