
Conversation

@MatthewBonanni MatthewBonanni commented Nov 13, 2025

Purpose

#24794 didn't mark FlashAttention as supporting sinks, so GPT-OSS ends up selecting TRITON_ATTN, which causes a failure. This PR fixes the issue.
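For context, here is a minimal sketch of the selection behavior in question. The `pick_backend` helper and its structure are illustrative assumptions, not vLLM's actual selector code; only the idea that the selector consults a `supports_sink()` check comes from this PR:

```python
# Illustrative sketch of sink-aware backend selection; pick_backend is
# hypothetical and stands in for vLLM's real selection logic.
def pick_backend(candidates, needs_sink: bool):
    for backend in candidates:
        # When the model (e.g. GPT-OSS) requires attention sinks, skip any
        # backend that is not marked as supporting them. Before this fix,
        # FlashAttention was wrongly skipped here, falling back to TRITON_ATTN.
        if needs_sink and not backend.supports_sink():
            continue
        return backend
    raise RuntimeError("No attention backend supports the requested features")
```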

Test Plan

python examples/offline_inference/spec_decode.py \
    --method "eagle3" \
    --tp 1 \
    --print-output \
    --model-dir "openai/gpt-oss-20b" \
    --eagle-dir "RedHatAI/gpt-oss-20b-speculator.eagle3" \
    --dataset_name "hf" \
    --dataset_path "philschmid/mt-bench" \
    --num-spec-tokens 3

Test Result

The command above previously failed; with this change, it runs successfully.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing the test command.
  • The test results, such as a before/after comparison or end-to-end results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly addresses the bug where FlashAttention was not marked as supporting sink tokens, which led to an incorrect backend selection for GPT-OSS models. The added supports_sink classmethod in FlashAttentionBackend properly checks both that FlashAttention's variable-length function is available and that it supports sinks. The implementation is straightforward, directly resolves the reported issue, and ensures the correct attention backend is selected.
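To make the reviewed change concrete, here is a minimal sketch of what a supports_sink classmethod along these lines could look like. Only FlashAttentionBackend and supports_sink come from the review above; the import path and the "s_aux" parameter name are assumptions for illustration, and the actual implementation in the PR may differ:

```python
import inspect


class FlashAttentionBackend:
    @classmethod
    def supports_sink(cls) -> bool:
        # Sink support requires the variable-length FlashAttention entry
        # point to be importable at all...
        try:
            from vllm.vllm_flash_attn import flash_attn_varlen_func
        except ImportError:
            return False
        # ...and to accept a sink-token argument ("s_aux" is assumed here
        # as the keyword; the real parameter name may differ).
        return "s_aux" in inspect.signature(flash_attn_varlen_func).parameters
```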

@mgoin mgoin added ready ONLY add when PR is ready to merge/full CI is needed bug Something isn't working gpt-oss Related to GPT-OSS models labels Nov 13, 2025
@mgoin mgoin merged commit f9f3b59 into vllm-project:main Nov 13, 2025
55 of 56 checks passed
@github-project-automation github-project-automation bot moved this from To Triage to Done in gpt-oss Issues & Enhancements Nov 13, 2025
@MatthewBonanni MatthewBonanni deleted the fa_sink branch November 13, 2025 18:23
geodavic pushed a commit to geodavic/vllm that referenced this pull request Nov 16, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: George D. Torres <gdavtor@gmail.com>
bwasti pushed a commit to bwasti/vllm that referenced this pull request Nov 17, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Bram Wasti <bwasti@meta.com>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
charlotte12l pushed a commit to charlotte12l/vllm that referenced this pull request Dec 5, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Zhathw pushed a commit to Zhathw/vllm that referenced this pull request Dec 6, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

Labels: bug (Something isn't working), gpt-oss (Related to GPT-OSS models), ready (ONLY add when PR is ready to merge/full CI is needed), v1