support cp&sp #2961
base: main
Conversation
Signed-off-by: LookAround <[email protected]>
Signed-off-by: weiguihua2 <[email protected]>
Signed-off-by: tanwenqin <[email protected]>
Signed-off-by: tanwenqin <[email protected]>
Signed-off-by: zhaoyifan <[email protected]>
Signed-off-by: zhangsicheng5 <[email protected]>
Signed-off-by: LookAround <[email protected]>
Signed-off-by: LookAround <[email protected]>
…nto long_seq_pr
# Conflicts:
#	vllm_ascend/attention/attention_v1.py
#	vllm_ascend/ops/fused_moe.py
#	vllm_ascend/worker/model_runner_v1.py
Signed-off-by: LookAround <[email protected]>
Signed-off-by: LookAround <[email protected]>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request introduces support for context parallelism (CP) and sequence parallelism (SP) to enable efficient inference for models with long sequences on Ascend NPUs. The changes are extensive, touching upon the scheduler, model runner, attention implementations (vanilla and MLA), and various operational layers. New metadata structures and logic have been added to manage the distributed state across CP and SP ranks. The implementation appears to leverage hardware-specific features for ring attention and parallel computations. Overall, this is a significant feature addition that enhances long-context capabilities. My review has identified a critical issue related to state management when reordering requests, which needs to be addressed.
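To make the context-parallelism idea concrete for reviewers, here is a minimal, illustrative Python sketch of how a long prompt could be sharded across CP ranks. It is not the code added by this PR; the names shard_sequence_for_cp, cp_rank and cp_world_size are hypothetical.

import torch

def shard_sequence_for_cp(input_ids: torch.Tensor,
                          cp_rank: int,
                          cp_world_size: int) -> torch.Tensor:
    # Illustrative only: each CP rank owns one contiguous chunk of the
    # token sequence; attention over the full context is then assembled
    # by exchanging KV blocks between ranks (e.g. ring attention).
    seq_len = input_ids.shape[0]
    chunk = (seq_len + cp_world_size - 1) // cp_world_size
    pad = chunk * cp_world_size - seq_len
    padded = torch.nn.functional.pad(input_ids, (0, pad))
    return padded[cp_rank * chunk:(cp_rank + 1) * chunk]

# Example: a 10-token prompt split across 4 CP ranks (chunk size 3, last chunk padded).
ids = torch.arange(10)
shards = [shard_sequence_for_cp(ids, r, 4) for r in range(4)]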
…nto long_seq_pr
# Conflicts:
#	vllm_ascend/models/deepseek_v2.py
Force-pushed from 45e9cda to 1977acf
Signed-off-by: LookAround <[email protected]>
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Signed-off-by: Apocalypse990923-qshi <[email protected]>
fix lint (part) + long_sequence_enable
Signed-off-by: Apocalypse990923-qshi <[email protected]>
merge remote-track main
Signed-off-by: weiguihua2 <[email protected]>
Signed-off-by: LookAround <[email protected]>
Signed-off-by: Apocalypse990923-qshi <[email protected]>
fix lint + mypy check
Signed-off-by: weiguihua2 <[email protected]>
Signed-off-by: LookAround <[email protected]>
Signed-off-by: SunnyLee219 <[email protected]>
Signed-off-by: SunnyLee219 <[email protected]>
Signed-off-by: weiguihua2 <[email protected]>
Signed-off-by: LookAround <[email protected]>
Signed-off-by: weiguihua2 <[email protected]>
Signed-off-by: SunnyLee219 <[email protected]>
Signed-off-by: weiguihua2 <[email protected]>
Signed-off-by: LookAround <[email protected]>
Signed-off-by: weiguihua2 <[email protected]>
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Signed-off-by: tanwenqin <[email protected]>
Please add more details to the PR message and link the related vLLM PR here.
Force-pushed from ae99fb6 to 566e57f
This pull request has conflicts, please resolve those before we can evaluate the pull request.
tensor_npu = _list_to_tensor(value, self.device)
self.kv_idx_names[key] = tensor_npu

# 处理序列长度相关张量 (handle sequence-length-related tensors)
It's better to use English comments.
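For context on the snippet above, the following is a minimal sketch of what a list-to-device-tensor helper could look like; the PR's actual _list_to_tensor is not included in this excerpt, and the dtype default here is an assumption.

import torch

def _list_to_tensor(value: list, device: torch.device,
                    dtype: torch.dtype = torch.int32) -> torch.Tensor:
    # Build the tensor in host memory first, then move it to the target
    # device (e.g. an Ascend NPU) in a single transfer.
    return torch.tensor(value, dtype=dtype).to(device, non_blocking=True)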
Signed-off-by: Delphine-Nic <[email protected]>
Merge main to 79a910e
What this PR does / why we need it?
Does this PR introduce any user-facing change?
How was this patch tested?