[Misc] Add check for dual_chunk_attention #24070
Conversation
Signed-off-by: zjy0516 <[email protected]>
Code Review

This pull request adds a validation check to ensure that `VLLM_ATTENTION_BACKEND` is set to `DUAL_CHUNK_FLASH_ATTN` for Qwen2.5 models, which is a great improvement for user experience because it catches misconfigurations early. My feedback focuses on improving the robustness of this check by raising a `ValueError` instead of using an `assert` statement.
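As a minimal sketch of the kind of check being discussed (the function name, the `dual_chunk_attention_config` attribute, and the way the backend is read from the environment are illustrative assumptions, not the exact code in this PR), raising a `ValueError` keeps the validation active even when assertions are stripped with `python -O`:

```python
import os


def check_dual_chunk_attention_backend(hf_config) -> None:
    """Fail fast at startup if the model uses dual chunk attention but the
    attention backend was not configured accordingly."""
    # Models without a dual chunk attention config need no special backend.
    if getattr(hf_config, "dual_chunk_attention_config", None) is None:
        return

    backend = os.environ.get("VLLM_ATTENTION_BACKEND")
    if backend != "DUAL_CHUNK_FLASH_ATTN":
        # A ValueError survives `python -O`, unlike an assert statement,
        # and gives the user an actionable message at startup.
        raise ValueError(
            "Models with dual chunk attention require "
            "VLLM_ATTENTION_BACKEND=DUAL_CHUNK_FLASH_ATTN; "
            f"got {backend!r} instead."
        )
```

Raising an exception rather than asserting also lets the message surface through normal startup error handling instead of an opaque `AssertionError`.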
Signed-off-by: zjy0516 <[email protected]>
Thanks for the contribution!
* 'main' of https://github.com/845473182/vllm: (457 commits)
  [BugFix] Fix routed_scaling_factor double mul for dots1 and glm4 MoE models (vllm-project#24132)
  [Misc] Add check for dual_chunk_attention (vllm-project#24070)
  [Doc]: fix typos in Python comments (vllm-project#24115)
  [Doc]: fix typos in Python comments (vllm-project#24093)
  [Compile] Fix Compile Warning for `w4a8_mm_entry.cu` (vllm-project#23660)
  fix some typos (vllm-project#24071)
  [V1] Wrapper which plumbs request-level logits processors into vLLM batch-level logits processing (vllm-project#23656)
  Upgrade xgrammar to 0.1.23 (vllm-project#22988)
  Update release pipeline post PyTorch 2.8.0 update (vllm-project#24073)
  [XPU] Fix the bug of LoRA logits on the XPU platform (vllm-project#24081)
  [CI/Build] Disable SiluMul NVFP4 quant fusion tests (vllm-project#24121)
  [Bug] R1 Accuracy: Fix `routed_scaling_factor` Double Mul Issue (vllm-project#24119)
  [AMD][Kernel][Bugfix] Cast offsets tensor bn to tl.int64 to avoid GPU segfault (vllm-project#23692)
  [CI] Enable all hf transformers baselines in test_hybrid (vllm-project#23936)
  [Log] Only Print Profiler Results on Rank 0 (vllm-project#23370)
  Fix weights loading for Apertus (vllm-project#24100)
  [Metrics] Deprecate TPOT in favor of ITL (vllm-project#24110)
  [Bugfix] Fix packed_factor missing attribute error (vllm-project#23902)
  Run ruff format on a few files. (vllm-project#24075)
  [Bugfix] Fix transform_config parsing in Compressed Tensors (vllm-project#23945)
  ...
Signed-off-by: zjy0516 <[email protected]> Signed-off-by: Shiyan Deng <[email protected]>
Signed-off-by: zjy0516 <[email protected]>
Signed-off-by: zjy0516 <[email protected]> Signed-off-by: LopezCastroRoberto <[email protected]>
Signed-off-by: zjy0516 <[email protected]> Signed-off-by: bruceszchen <[email protected]>
Signed-off-by: zjy0516 <[email protected]> Signed-off-by: bruceszchen <[email protected]>
Purpose
FIX #24048
Add an early validation check for the `DUAL_CHUNK_FLASH_ATTN` attention backend (selected via the `VLLM_ATTENTION_BACKEND` environment variable) when using the Qwen2.5-14B-Instruct-1M model. This preemptive check triggers a clear error on startup rather than letting the code fail later, and more obscurely, inside the attention module. An illustrative configuration sketch follows the checklist below.

Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.
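For illustration only (the model name, `max_model_len`, and use of the offline `LLM` entry point are assumptions, not commands taken from this PR), a launch that satisfies the new check sets the backend before the engine is created; omitting the variable would now fail immediately with the startup error described above rather than deep inside the attention module:

```python
import os

# The backend must be selected before vLLM chooses an attention implementation.
os.environ["VLLM_ATTENTION_BACKEND"] = "DUAL_CHUNK_FLASH_ATTN"

from vllm import LLM

# Example model that uses dual chunk attention; max_model_len is illustrative.
llm = LLM(model="Qwen/Qwen2.5-14B-Instruct-1M", max_model_len=65536)

outputs = llm.generate("Summarize dual chunk attention in one sentence.")
print(outputs[0].outputs[0].text)
```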