[Bug] [Spec Decode] Fix model_initialization test and mismatch in aux_hidden_layers #24613
Conversation
Signed-off-by: wwl2755 <[email protected]>
Code Review
This pull request effectively addresses two bugs related to speculative decoding tests. First, it corrects the model initialization to use the proper target model instead of the speculative model. Second, it resolves a shape mismatch for aux_hidden_layers by introducing a mechanism to use the original number of layers from the model config during testing. The changes are logical and well-implemented. I have one suggestion to improve the robustness of how the layer count is determined in the test utilities.
num_layers = getattr(text_config, 'num_layers', 1)
num_hidden_layers = getattr(text_config, 'num_hidden_layers', 1)
This logic for determining num_layers and num_hidden_layers can be fragile. If a model config only defines num_hidden_layers (which is common for many models like Llama and Qwen), num_layers will default to 1. This could lead to issues if other parts of the code rely on config.num_layers. A more robust approach would be to fetch the layer count from either attribute and ensure both num_layers and num_hidden_layers are consistent.
Suggested change:

- num_layers = getattr(text_config, 'num_layers', 1)
- num_hidden_layers = getattr(text_config, 'num_hidden_layers', 1)
+ num_layers = getattr(text_config, "num_hidden_layers", getattr(text_config, "num_layers", 1))
+ num_hidden_layers = num_layers
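To make the concern concrete, here is a minimal standalone sketch (using a hypothetical SimpleNamespace stand-in for the text config, not vLLM's actual config class) showing how the original pattern diverges when only num_hidden_layers is defined, and how the suggested version keeps both attributes in sync:

from types import SimpleNamespace

# Hypothetical config that, like Llama or Qwen, only defines num_hidden_layers.
text_config = SimpleNamespace(num_hidden_layers=32)

# Original pattern: the two lookups are independent, so num_layers
# silently falls back to 1 while num_hidden_layers is 32.
num_layers = getattr(text_config, 'num_layers', 1)                # -> 1
num_hidden_layers = getattr(text_config, 'num_hidden_layers', 1)  # -> 32

# Suggested pattern: prefer num_hidden_layers, fall back to num_layers,
# and keep both values consistent.
num_layers = getattr(text_config, "num_hidden_layers",
                     getattr(text_config, "num_layers", 1))       # -> 32
num_hidden_layers = num_layers                                    # -> 32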
@wwl2755 If I understand correctly - this is just to fix our CI setup, right?
I turned on the ready label to see if it fixes the issue.
Yes. It enables some tests to use the original num_layers instead of 1. All the modifications are under
Signed-off-by: wwl2755 <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Cyrus Leung <[email protected]>
…_hidden_layers (vllm-project#24613) Signed-off-by: wwl2755 <[email protected]> Signed-off-by: Roger Wang <[email protected]> Signed-off-by: Cyrus Leung <[email protected]> Co-authored-by: Roger Wang <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
Fix the bug of model_initialization. The model should be initialized by a target model, which is different from the speculative_model.

After changing to the correct target model, I faced another error, caused by the different shape in hidden_states + residual.

Root cause: num_layers is initialized to 1 in the test (https://github.com/vllm-project/vllm/blob/v0.10.2rc1/tests/models/utils.py#L387). However, the aux_hidden_layer is determined as (2, num_layers // 2, num_layers - 3) (https://github.com/vllm-project/vllm/blob/v0.10.2rc1/tests/models/utils.py#L387). So, before this PR, layer-0 was treated as the aux-layer, entering the code path of hidden_states + residual (where hidden_states is a tensor and residual is None).

Solution: we set some of the eagle3 models to have the original number of layers.
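As a quick illustration (a standalone sketch, not the actual test code), plugging num_layers = 1 into that formula shows why layer-0 gets selected as an aux layer:

# Standalone sketch: how the aux hidden layer indices collapse when the
# test initializes the model with a single layer.
def aux_hidden_layers(num_layers: int) -> tuple:
    return (2, num_layers // 2, num_layers - 3)

print(aux_hidden_layers(1))   # (2, 0, -2): layer 0 is picked as an aux layer
print(aux_hidden_layers(32))  # (2, 16, 29): the intended spread across layers

# With num_layers = 1, layer 0 is in the aux set, so the forward pass hits
# hidden_states + residual while residual is still None, producing the error
# this PR fixes by keeping the original layer count for these tests.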
cc: @DarkLight1337 @WoosukKwon @LiuXiaoxuanPKU @22quinn @benchislett
Tests
All passed.