[Compile] Fix Compile Warning for w4a8_mm_entry.cu
#23660
Conversation
Signed-off-by: yewentao256 <[email protected]>
Code Review

This pull request addresses a compilation warning in `w4a8_mm_entry.cu` caused by a narrowing conversion from `int64_t` to `int` for the `group_size` parameter. The solution implements a runtime check to validate that `group_size` is within the representable range of an `int` before casting it. This change is correct, safe, and effectively resolves the compiler warning. The updated code is clean and I have no further suggestions for improvement.
@mgoin CC
LGTM, thanks
Please merge from main to fix CI
Purpose

Fix the compile warning for `w4a8_mm_entry.cu` (narrowing conversion from `int64_t` to `int` for `group_size`).

Test

The file now compiles without the narrowing-conversion warning.