[Bugfix] Fix DeepEP config for DP4TP4 #23619
Code Review
This pull request fixes an assertion error in DeepEP by using the dispatcher count instead of the data-parallel size for the combine configuration. The change appears correct based on the issue description. However, I've identified a critical issue where the check for supported rank configurations remains inconsistent with the new value, which could lead to a runtime crash. I've provided a suggestion to correct this.
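To illustrate the inconsistency described above, here is a minimal sketch of how a guard that validates one value while the lookup uses another can crash at runtime. The table, its values, and both function names are hypothetical and are not taken from the vLLM or DeepEP source.

```python
# Hypothetical table mapping a rank count to a tuning config.
# The keys and values are invented for illustration only.
HYPOTHETICAL_CONFIGS = {2: "cfg-2", 4: "cfg-4", 8: "cfg-8", 16: "cfg-16"}


def lookup_inconsistent(dp_size, num_dispatchers):
    """Guard checks dp_size, but the lookup uses num_dispatchers."""
    if dp_size not in HYPOTHETICAL_CONFIGS:
        return None
    # The guard above says nothing about num_dispatchers,
    # so this line can raise KeyError.
    return HYPOTHETICAL_CONFIGS[num_dispatchers]


def lookup_consistent(num_dispatchers):
    """Guard and lookup use the same value."""
    if num_dispatchers not in HYPOTHETICAL_CONFIGS:
        return None
    return HYPOTHETICAL_CONFIGS[num_dispatchers]
```

In this sketch, `lookup_inconsistent(4, 12)` passes the guard and then fails on the lookup, while `lookup_consistent(12)` returns `None` cleanly.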
vllm/model_executor/layers/fused_moe/deepep_ht_prepare_finalize.py
cc @tlrmchlsmth
Do we need to change _get_dispatch_config too?
It looks like both of these should be passing in the ep_size. Could you update the PR to pass that in, @minosfuture?
@tlrmchlsmth I didn't hit the assertion failure for dispatch. But yeah, let me update that and test.
b59b7e0 to f91bc9c
Signed-off-by: Ming Yang <[email protected]>
f91bc9c to 805017e
Purpose
To fix the following assertion error, the rank count should be the dispatcher count (EP count) instead of the DP count.
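As a rough illustration of why the two counts differ, the sketch below assumes the expert-parallel group spans every DP × TP rank, which is what the DP4TP4EP16 test configuration implies. The function name is hypothetical and is not part of vLLM or DeepEP.

```python
def expert_parallel_size(dp_size: int, tp_size: int) -> int:
    """Hypothetical helper: number of ranks in the EP group.

    Assumes expert parallelism spans all DP x TP ranks,
    as in the DP4TP4EP16 setup used to test this PR.
    """
    return dp_size * tp_size


dp_size, tp_size = 4, 4
ep_size = expert_parallel_size(dp_size, tp_size)

# The config is looked up by rank count. Passing dp_size here would
# describe 4 ranks even though 16 ranks take part in dispatch/combine.
rank_count_for_config = ep_size
```

Under that assumption, DP4TP4 gives an EP size of 16, so a config selected for 4 ranks would not match the actual group.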
Test Plan
Test DP4TP4EP16
Test Result
The DP4TP4EP16 configuration runs.