
Conversation

@princepride (Contributor) commented Aug 14, 2025

Purpose

This PR introduces support for the mBART model. This is a necessary prerequisite for the implementation of the Dolphin model, which is tracked in issue #18850, as Dolphin requires mBART for its decoder.

The main difference from the existing BART model is that mBART uses pre-layer-normalization: LayerNorm is applied before the self-attention and feed-forward sublayers (with an additional final LayerNorm after the last layer), whereas BART normalizes after each sublayer. The encoder_decoder.py example has also been updated to handle both BART and the new mBART model, ensuring backward compatibility.
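For illustration, a minimal sketch of the two LayerNorm placements (the class names below are hypothetical simplifications, not the actual vLLM modules):

import torch.nn as nn

class PostNormEncoderLayer(nn.Module):
    """BART-style layer: normalize after the residual add."""
    def __init__(self, d_model, attn, ffn):
        super().__init__()
        self.attn, self.ffn = attn, ffn
        self.attn_norm = nn.LayerNorm(d_model)
        self.ffn_norm = nn.LayerNorm(d_model)

    def forward(self, x):
        x = self.attn_norm(x + self.attn(x))  # post-norm
        x = self.ffn_norm(x + self.ffn(x))
        return x

class PreNormEncoderLayer(nn.Module):
    """mBART-style layer: normalize before each sublayer."""
    def __init__(self, d_model, attn, ffn):
        super().__init__()
        self.attn, self.ffn = attn, ffn
        self.attn_norm = nn.LayerNorm(d_model)
        self.ffn_norm = nn.LayerNorm(d_model)

    def forward(self, x):
        x = x + self.attn(self.attn_norm(x))  # pre-norm
        x = x + self.ffn(self.ffn_norm(x))
        return x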

Test Plan

The implementation was verified using the offline inference example for both the new mBART model and the existing BART model.

  1. Test mBART support:
    python examples/offline_inference/encoder_decoder.py --model mbart
  2. Test BART backward compatibility:
    python examples/offline_inference/encoder_decoder.py --model bart

Test Result

Both test commands executed successfully and produced the expected outputs, confirming that the mBART implementation is correct and that backward compatibility with BART is maintained.

mBART Test Result:

[Screenshot: mBART offline inference output]

BART Test Result:

[Screenshot: BART offline inference output]
@princepride requested a review from hmellor as a code owner, August 14, 2025 08:04
@mergify bot added the documentation (Improvements or additions to documentation) and new-model (Requests to new models) labels Aug 14, 2025
@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces support for the MBart model, which is a prerequisite for the Dolphin model. The changes include adding the MBart model implementation, updating the model registry, and refactoring the encoder-decoder example script to support both BART and MBart. The implementation correctly captures the architectural differences between BART and MBart, specifically the pre-norm layer normalization.

My review focuses on the new MBartForConditionalGeneration implementation. I found a critical issue in the weight loading logic that could lead to incorrect model loading due to its reliance on the order of weights in the checkpoint. I've provided a suggestion to make it more robust.
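The order-independent pattern being suggested looks roughly like the sketch below (a minimal sketch, not the exact code proposed in the review; it assumes each checkpoint tensor name maps directly to a parameter name, and uses default_weight_loader from vllm.model_executor.model_loader.weight_utils as the fallback):

from vllm.model_executor.model_loader.weight_utils import default_weight_loader

def load_weights_by_name(model, weights):
    """Match each checkpoint tensor to a parameter by name, not by position,
    so reordering the checkpoint cannot silently load the wrong weights."""
    params_dict = dict(model.named_parameters())
    for name, loaded_weight in weights:
        param = params_dict[name]  # a KeyError here surfaces unexpected or renamed tensors early
        weight_loader = getattr(param, "weight_loader", default_weight_loader)
        weight_loader(param, loaded_weight)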

@github-actions (bot) commented

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@princepride changed the title from "[New Model]Mbart model" to "[New Model]mBART model" Aug 14, 2025
@noooop (Collaborator) left a comment

Please add this model to tests/models/registry.py.

@noooop (Collaborator) left a comment

@princepride (Contributor, author) commented

cc @Isotr0py Can you help me review it? Thank you. 😊

@Isotr0py (Member) left a comment

Some initial comments, will take a deeper look later tonight.

@princepride (Contributor, author) commented

@Isotr0py @DarkLight1337 Hi, can you help review it? Thank you.

@princepride (Contributor, author) commented

@DarkLight1337 Can you review it again? Thank you.

@DarkLight1337 (Member) left a comment

Thanks, can you add a test to check the correctness against HF?

@princepride (Contributor, author) commented

@DarkLight1337

Test code:

import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
from vllm import LLM, SamplingParams
from vllm.inputs import ExplicitEncoderDecoderPrompt, TokensPrompt

# --- 1. General Setup ---
MODEL_ID = "facebook/mbart-large-en-ro"

# List of articles to test for translation
TEST_ARTICLES = [
    "UN Chief Says There Is No Military Solution in Syria",
    "Engineers are building a new bridge across the river.",
    "Data science is a multidisciplinary field that uses scientific methods.",
]

SAMPLING_PARAMS = SamplingParams(temperature=0, max_tokens=64)
results = []  # To store the outputs for final comparison

print("--- Initializing Models and Tokenizers ---")

# --- 2. Initialize Hugging Face Model ---
print("Loading Hugging Face model...")
# Using MBart50TokenizerFast is more specific for this model family
hf_tokenizer = MBart50TokenizerFast.from_pretrained(MODEL_ID, src_lang="en_XX")
hf_model = MBartForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
print("Hugging Face model loaded successfully.")

# --- 3. Initialize vLLM Engine ---
print("Loading vLLM engine...")
# The hf_overrides parameter can help vLLM resolve the correct architecture if the config is ambiguous
llm = LLM(
    model=MODEL_ID,
    dtype="bfloat16",
    hf_overrides={"architectures": ["MBartForConditionalGeneration"]},
)
print("vLLM engine loaded successfully.\n")

# --- 4. Process Each Article and Generate Outputs ---
for i, article in enumerate(TEST_ARTICLES):
    print(f"--- Processing Test Case {i+1}/{len(TEST_ARTICLES)} ---")
    print(f"Input: {article}")

    # --- A. Hugging Face Inference ---
    inputs = hf_tokenizer(article, return_tensors="pt").to(hf_model.device)
    # The key step: specifying the target language for the decoder
    translated_tokens = hf_model.generate(
        **inputs,
        decoder_start_token_id=hf_tokenizer.lang_code_to_id["ro_RO"],
        max_new_tokens=64,
    )
    hf_output = hf_tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
    print(f"HF Output: {hf_output}")

    # --- B. vLLM Inference ---
    encoder_prompt = article
    # The equivalent step in vLLM: providing the target language token as the decoder's prompt
    decoder_prompt_token_ids = [hf_tokenizer.lang_code_to_id["ro_RO"]]
    vllm_prompt = ExplicitEncoderDecoderPrompt(
        encoder_prompt=encoder_prompt,
        decoder_prompt=TokensPrompt(prompt_token_ids=decoder_prompt_token_ids),
    )
    outputs = llm.generate([vllm_prompt], SAMPLING_PARAMS)
    vllm_output = outputs[0].outputs[0].text
    print(f"vLLM Output: {vllm_output}\n")

    # --- C. Store results for the final summary ---
    results.append({
        "original": article,
        "hf_output": hf_output,
        "vllm_output": vllm_output,
    })

# --- 5. Final Comparison Table ---
print("\n" + "=" * 80)
print(" " * 28 + "FINAL COMPARISON SUMMARY")
print("=" * 80 + "\n")

for i, res in enumerate(results):
    print(f"--- Test Case {i+1} ---")
    print(f"Original Text: {res['original']}")
    print(f"HF Output: {res['hf_output']}")
    print(f"vLLM Output: {res['vllm_output']}")
    # Verify if the outputs are identical
    is_match = res['hf_output'] == res['vllm_output']
    print(f"Outputs Match: {is_match}")
    print("-" * 50)

Result:

[Screenshot: side-by-side HF vs vLLM translation outputs]
@noooop (Collaborator) commented Aug 15, 2025

Please refer to https://github.com/vllm-project/vllm/blob/main/tests/models/language/generation/test_bart.py

The purpose of this test is to prevent someone from accidentally breaking mBART.
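Concretely, the kind of check needed looks roughly like the sketch below (illustrative only: the model ID, prompt, and exact-match assertion are simplifications; the actual test follows test_bart.py, which compares top log-probabilities rather than exact output strings):

import pytest
import torch
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration
from vllm import LLM, SamplingParams
from vllm.inputs import ExplicitEncoderDecoderPrompt, TokensPrompt

MODEL_ID = "facebook/mbart-large-en-ro"  # illustrative checkpoint

@pytest.mark.parametrize("text", ["UN Chief Says There Is No Military Solution in Syria"])
def test_mbart_greedy_matches_hf(text):
    tok = MBart50TokenizerFast.from_pretrained(MODEL_ID, src_lang="en_XX")
    ro = tok.lang_code_to_id["ro_RO"]

    # Reference output from Hugging Face Transformers
    hf = MBartForConditionalGeneration.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
    hf_ids = hf.generate(**tok(text, return_tensors="pt"),
                         decoder_start_token_id=ro, max_new_tokens=32)
    hf_out = tok.batch_decode(hf_ids, skip_special_tokens=True)[0]

    # Greedy output from vLLM for the same encoder/decoder prompt
    llm = LLM(model=MODEL_ID, dtype="float32")
    prompt = ExplicitEncoderDecoderPrompt(
        encoder_prompt=text,
        decoder_prompt=TokensPrompt(prompt_token_ids=[ro]))
    vllm_out = llm.generate([prompt], SamplingParams(temperature=0, max_tokens=32))[0].outputs[0].text

    # Exact-match assertion is a simplification; the CI test uses a logprob-based comparison.
    assert vllm_out.strip() == hf_out.strip()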

@DarkLight1337 (Member) commented

Yeah, make sure the test is run in CI

@princepride requested a review from noooop, August 16, 2025 04:47
@princepride (Contributor, author) commented

@DarkLight1337 @noooop I have added a CI test for mBART based on test_bart.py; please review it, thank you.
[Screenshot: mBART test output]

@DarkLight1337 (Member) left a comment

Thanks, LGTM!

@DarkLight1337 enabled auto-merge (squash) August 16, 2025 06:26
@github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Aug 16, 2025
@princepride (Contributor, author) commented

@DarkLight1337 The failed checks seem to be unrelated to my code.

@DarkLight1337 merged commit 829bbd7 into vllm-project:main Aug 16, 2025
43 checks passed