Enable headless models for pooling in the Transformers backend #21767
Conversation
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs will not trigger a full CI run by default; only a small and essential subset of tests runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. 🚀
Code Review
This pull request enables loading headless models for embedding with the Transformers backend, which is a great addition. The changes in the configuration and registry look correct, and the new test case covers the intended scenarios.
However, I've found a critical issue in the WeightsMapper implementation for the new TransformersModel. The current logic for prefixing weights is flawed due to incorrect key ordering in the dictionary, which will cause weight loading to fail for one of the model formats this PR aims to support. I've provided a detailed comment and a code suggestion to fix this.
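For readers unfamiliar with the pitfall the review describes, here is a minimal, self-contained sketch (plain Python, deliberately not vLLM's actual `WeightsMapper` API) of how insertion order in a prefix-mapping dict can break weight renaming when one key is a prefix of another:

```python
# Hypothetical order-sensitive prefix mapper, for illustration only.
def map_weight_name(name: str, orig_to_new_prefix: dict[str, str]) -> str:
    # The first matching prefix wins, so dict insertion order matters.
    for orig, new in orig_to_new_prefix.items():
        if name.startswith(orig):
            return new + name[len(orig):]
    return name

# Broken ordering: the empty-string prefix matches every name, so the
# "model." entry below it is never reached and already-prefixed weights
# get double-prefixed.
broken = {"": "model.", "model.": "model."}
# Fixed ordering: check the more specific prefix first.
fixed = {"model.": "model.", "": "model."}

print(map_weight_name("model.layers.0.self_attn.q_proj.weight", broken))
# model.model.layers.0.self_attn.q_proj.weight  (wrong: double prefix)
print(map_weight_name("model.layers.0.self_attn.q_proj.weight", fixed))
# model.layers.0.self_attn.q_proj.weight  (left unchanged, as intended)
print(map_weight_name("layers.0.self_attn.q_proj.weight", fixed))
# model.layers.0.self_attn.q_proj.weight  (root-level weight gets prefixed)
```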
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Thanks for extending this support!
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
The mapper is having issues, I'll disable auto-merge for now
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Enable headless models for pooling in the Transformers backend (vllm-project#21767) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Enable headless models for pooling in the Transformers backend (vllm-project#21767) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Enable headless models for pooling in the Transformers backend (vllm-project#21767) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Noam Gat <noamgat@gmail.com>
Enable headless models for pooling in the Transformers backend (vllm-project#21767) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Paul Pak <paulpak58@gmail.com>
Enable headless models for pooling in the Transformers backend (vllm-project#21767) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Diego-Castan <diego.castan@ibm.com>
Previously, embedding model checkpoints that had their layers at the root of the checkpoint would not load correctly with the Transformers backend.
This PR enables the loading of Transformers base model classes.
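As a rough sketch of how this might be used (the model name is a placeholder, and the exact `LLM` arguments — `task="embed"`, `model_impl="transformers"`, and the `embed()` method — should be checked against the current vLLM docs):

```python
from vllm import LLM

llm = LLM(
    model="my-org/my-headless-encoder",  # placeholder: any headless checkpoint
    task="embed",                        # pooling/embedding task
    model_impl="transformers",           # force the Transformers backend
)

# Embed a sentence and inspect the resulting vector's dimensionality.
outputs = llm.embed(["vLLM can now pool with headless checkpoints."])
print(len(outputs[0].outputs.embedding))
```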
Now, both of the following checkpoint formats will work for pooling tasks:
- `ModelForCausalLM`: the backbone weights are nested under a prefix (typically `model.`), with a language-modeling head on top
- `Model`: a headless checkpoint, with the backbone weights at the root
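Concretely, with illustrative Llama-style weight names (not taken from the PR itself), the two layouts differ only in the prefix, and the fix discussed above amounts to normalizing root-level names so both layouts resolve to the same parameter names inside the backend:

```python
# Illustrative weight names for the two checkpoint formats.
causal_lm_style = [
    "model.embed_tokens.weight",               # backbone nested under "model."
    "model.layers.0.self_attn.q_proj.weight",
    "lm_head.weight",                          # head that pooling does not need
]
headless_style = [
    "embed_tokens.weight",                     # backbone at the checkpoint root
    "layers.0.self_attn.q_proj.weight",
]

# Prefix root-level names so both layouts load through the same code path.
normalized = [
    name if name.startswith(("model.", "lm_head")) else f"model.{name}"
    for name in headless_style
]
print(normalized)
# ['model.embed_tokens.weight', 'model.layers.0.self_attn.q_proj.weight']
```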