DBO HT without cudagraph #113

yewentao256 · 2025-09-05T19:47:48Z

Purpose

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

LucasWilkinson · 2025-09-05T19:59:12Z

vllm/model_executor/layers/fused_moe/modular_kernel.py

-            if dbo_enabled():
+            if isinstance(prepare_ret, tuple):
+                hook, receiver = prepare_ret
+            else:


How does this differ from the if not self.prepare_finalize.supports_async(): path?

I think on the self.prepare_finalize.prepare() path, receiver is called first and its result is then packed into (a1q, a1q_scale, expert_tokens_meta, _expert_topk_ids, _expert_topk_weights). So we don't need to update it?

I guess I'm curious why this is needed, since I thought HT would go through the:

if self.shared_experts is not None:
    shared_output = self.shared_experts(a1)
(a1q, a1q_scale, expert_tokens_meta, _expert_topk_ids,
 _expert_topk_weights) = self.prepare_finalize.prepare(
    a1,
    a1_scale,
    a2_scale,
    topk_weights,
    topk_ids,
    global_num_experts,
    expert_map,
    apply_router_weight_on_input,
    self.fused_experts.quant_config,
)

path

Do you mean for HT, supports_async should be False instead of True?

def supports_async(self) -> bool: return True

Since supports_async currently returns True, we take the branch that calls self.prepare_finalize.prepare_async(...)

We then need to stay compatible with low latency, because it returns

return (hook, lambda hook: self._receiver(hook, expert_x, expert_num_tokens, a1_scale, a1.dtype, quant_config))
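The tuple check discussed in this thread can be illustrated with a small standalone sketch. It assumes, hypothetically, that an async prepare call returns either a (hook, receiver) pair, as the low-latency return quoted above does, or a bare receiver callable. The names unpack_prepare_ret and run_prepare are made up for illustration and are not vLLM APIs, and the receiver here takes no arguments for simplicity.

```python
from typing import Any, Callable, Optional, Tuple, Union

# Assumed shapes, for illustration only.
Hook = Callable[[], None]
Receiver = Callable[[], Any]


def unpack_prepare_ret(
    prepare_ret: Union[Tuple[Optional[Hook], Receiver], Receiver],
) -> Tuple[Optional[Hook], Receiver]:
    """Normalize the two assumed return shapes into (hook, receiver)."""
    if isinstance(prepare_ret, tuple):
        # Pair form: a hook to run first, plus a receiver.
        hook, receiver = prepare_ret
    else:
        # Bare form (assumption): only a receiver, no hook to wait on.
        hook, receiver = None, prepare_ret
    return hook, receiver


def run_prepare(prepare_ret: Any) -> Any:
    """Run the hook if one exists, then call the receiver for the result."""
    hook, receiver = unpack_prepare_ret(prepare_ret)
    if hook is not None:
        hook()
    return receiver()
```

Checking the shape of the return value, rather than a global flag such as dbo_enabled(), keeps the caller correct for either form regardless of which mode is active.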

LucasWilkinson · 2025-09-07T23:56:08Z

vllm/distributed/device_communicators/all2all.py

+            compute_sms = total_sms - self.num_sms
+            assert compute_sms > 0, "compute_sms must be greater than 0"
+            logger.info("Setting DeepGEMM num_sms to %d for dbo", compute_sms)
+            dg.set_num_sms(compute_sms)


Can we restrict this to only when the batch is actually running DBO? Or does this already do that?

I think we do it already? dbo_enabled() is one of the conditions.
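The SM split in the diff above, together with the gating asked about here, amounts to simple bookkeeping, sketched below. The function name, the dbo_active flag, and the set_num_sms callback (standing in for dg.set_num_sms) are illustrative assumptions, not code from the PR.

```python
from typing import Callable, Optional


def maybe_limit_compute_sms(
    total_sms: int,
    comm_sms: int,
    dbo_active: bool,
    set_num_sms: Callable[[int], None],
) -> Optional[int]:
    """Reserve comm_sms for communication and give the rest to compute.

    Returns the compute SM count that was applied, or None when DBO is
    not active and nothing was changed.
    """
    if not dbo_active:
        # Leave the kernel library's SM setting untouched.
        return None
    compute_sms = total_sms - comm_sms
    assert compute_sms > 0, "compute_sms must be greater than 0"
    set_num_sms(compute_sms)
    return compute_sms
```

For example, with a hypothetical device of 132 SMs and 20 reserved for communication, the callback would receive 112; with dbo_active set to False it is never called.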

yewentao256 added 4 commits September 4, 2025 12:28

use_ht

7077c02

fix interface bug

25c6a14

support dbo for HT

3f0f1e4

add deepgemm sms

278d727

yewentao256 requested review from youkaichao, robertgshaw2-redhat, mgoin, tlrmchlsmth and ProExpertProg as code owners September 5, 2025 19:47

LucasWilkinson reviewed Sep 5, 2025

LucasWilkinson reviewed Sep 7, 2025

yewentao256 requested review from njhill and alexm-redhat as code owners September 9, 2025 15:02

fix acc issue

ac8fbb7

yewentao256 force-pushed the wye-dbo-full-cudagraph-ht branch from b01acdd to ac8fbb7 September 10, 2025 19:52

class var

7280bee
