@Tsundoku958 (Contributor) commented on Dec 9, 2025


Motivation

I noticed that the current lmdeploy does not use tensor parallelism for the embedding layer and lm_head, yet they consume nearly as much GPU memory as the linear layers. This PR adds support for tensor parallelism in the embedding layer.

Modification

  • Row-wise tensor parallelism for the embedding layer (see the sketch after this list).
  • Corresponding unit test files.
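
For reference, here is a minimal sketch of how row-wise (vocab-sharded) embedding tensor parallelism typically works. This is illustrative only, assuming an initialized torch.distributed process group; the class and variable names are mine, not the PR's actual implementation:

```python
# Illustrative sketch only: row-wise (vocab-sharded) embedding TP.
# Assumes torch.distributed has been initialized (e.g. via torchrun);
# gradient flow is ignored, since lmdeploy is an inference engine.
import torch
import torch.nn as nn
import torch.distributed as dist


class RowParallelEmbedding(nn.Module):
    """Each rank stores a contiguous slice of the vocab rows.

    A lookup masks out ids owned by other ranks, embeds the local ids,
    and an all-reduce sums the partial results across the TP group.
    """

    def __init__(self, num_embeddings: int, embedding_dim: int):
        super().__init__()
        self.tp_size = dist.get_world_size()
        rank = dist.get_rank()
        assert num_embeddings % self.tp_size == 0, "pad vocab to a multiple of tp_size"
        shard = num_embeddings // self.tp_size
        self.vocab_start = rank * shard
        self.vocab_end = self.vocab_start + shard
        # Only 1/tp_size of the full embedding table lives on this rank.
        self.weight = nn.Parameter(torch.empty(shard, embedding_dim))
        nn.init.normal_(self.weight, std=0.02)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        # Ids outside [vocab_start, vocab_end) belong to other ranks.
        mask = (input_ids < self.vocab_start) | (input_ids >= self.vocab_end)
        local_ids = (input_ids - self.vocab_start).masked_fill(mask, 0)
        out = nn.functional.embedding(local_ids, self.weight)
        # Zero rows this rank does not own; the owning rank contributes them.
        out = out.masked_fill(mask.unsqueeze(-1), 0.0)
        if self.tp_size > 1:
            dist.all_reduce(out)  # sum partial embeddings across TP ranks
        return out
```

The same sharding applies to lm_head, where the partial logits per vocab shard would instead be gathered (or kept sharded for a fused sampling kernel), which is why the memory savings extend to both ends of the model.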

Perhaps TP (tensor parallelism) for the embedding layer and lm_head could be enabled by default in lmdeploy, or a new argument could be added to let users control whether embedding parallelism is enabled?
@grimoire @lvhan028
