[Model] Decouple glm4v #22751
Conversation
Signed-off-by: Jee Jee Li <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. 🚀
Code Review
This pull request effectively decouples `Glm4vForConditionalGeneration` and `Glm4vMoeForConditionalGeneration` to address differences in their configurations and improve maintainability. The changes to the model registry, documentation, and `packed_modules_mapping` are appropriate. I have one suggestion to make the code more robust by explicitly disabling LoRA support for `Glm4vMoeForConditionalGeneration` to prevent potential runtime errors, aligning the code's behavior with the stated intention in the documentation and pull request description.
```python
packed_modules_mapping = {
    "qkv_proj": [
        "q_proj",
        "k_proj",
        "v_proj",
    ],
    "gate_up_proj": [
        "gate_proj",
        "up_proj",
    ],
}
```
The pull request description and documentation state that LoRA is not supported for `Glm4vMoeForConditionalGeneration` due to a bug. However, the class currently inherits `SupportsLoRA` from its parent, which will cause vLLM to attempt to apply LoRA adapters if provided, leading to a runtime error. To prevent this, I suggest adding an `__init__` method that explicitly checks for and disallows LoRA configuration for this model. This will provide a clear error message to users and make the code's behavior consistent with the documentation.
Suggested change:

```python
packed_modules_mapping = {
    "qkv_proj": [
        "q_proj",
        "k_proj",
        "v_proj",
    ],
    "gate_up_proj": [
        "gate_proj",
        "up_proj",
    ],
}

def __init__(self, *, vllm_config: VllmConfig, prefix: str = ""):
    # LoRA is not supported for this model yet.
    if vllm_config.lora_config:
        raise NotImplementedError(
            "LoRA is not currently supported for "
            "Glm4vMoeForConditionalGeneration."
        )
    super().__init__(vllm_config=vllm_config, prefix=prefix)
```
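For illustration, a minimal sketch of how this guard would surface to users; the checkpoint id `zai-org/GLM-4.5V` and the offline `LLM` entrypoint here are assumptions for the example, not part of the suggestion itself:

```python
# Hypothetical repro sketch: with the suggested __init__ guard in place,
# enabling LoRA on the MoE model fails fast with a clear message instead
# of a confusing target_module error during adapter loading.
from vllm import LLM

try:
    llm = LLM(model="zai-org/GLM-4.5V", enable_lora=True)  # model id assumed
except NotImplementedError as e:
    print(e)
```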
Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Diego-Castan <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Xiao Yu <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Essential Elements of an Effective PR Description Checklist

(Optional) The necessary documentation update, such as updating `supported_models.md` and `examples` for a new model.

Purpose

This PR decouples `Glm4vForConditionalGeneration` and `Glm4vMoeForConditionalGeneration`. On one hand, their `packed_modules_mapping` contents differ (the root cause of [Bug]: GLM-4.1V lora trained model reports target_module mismatch error #22077); on the other hand, the split makes the model code easier to maintain. `Glm4vMoeForConditionalGeneration` does not support LoRA and will raise the following error. I have removed the LoRA label for now and will investigate the issue ASAP.

cc @zRzRzRzRzRzRzR
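To illustrate the decoupling pattern, here is a minimal sketch; the exact contents of each mapping are illustrative assumptions, not this PR's actual diff:

```python
import torch.nn as nn

class Glm4vForConditionalGeneration(nn.Module):
    # Dense GLM-4.1V: both the attention and MLP projections are fused,
    # so LoRA's packed-module resolution must expand both entries.
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }

class Glm4vMoeForConditionalGeneration(Glm4vForConditionalGeneration):
    # The MoE variant overrides the class attribute instead of inheriting
    # it, so LoRA target-module matching only sees packed modules that this
    # backbone actually has (contents assumed for illustration).
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    }
```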
Test Plan
Test Result
(Optional) Documentation Update