chenxi-yang
Contributor

Summary: as title, generated with D80713197

Test Plan:
Run fused_moe on H100_80GB

Rollback Plan:

Reviewed By: zzh142857

Differential Revision: D80713433

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D80713433

Summary:
Pull Request resolved: vllm-project#23443


Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds new FP8 configurations for glm4.5v on H100_80GB GPUs with tensor parallelism of 2 and 4. It introduces a new environment variable VLLM_USE_FUSED_MOE_KERNEL_IN_COMPRESSED_QUANTIZATION to control the fused MoE kernel usage. The tensor parallelism logic in glm4_1v.py is refactored to align with vLLM's standard implementation. My review includes a suggestion to refactor duplicated code in compressed_tensors_moe.py to improve maintainability.
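For readers unfamiliar with this kind of toggle, here is a minimal sketch of how a boolean environment flag such as VLLM_USE_FUSED_MOE_KERNEL_IN_COMPRESSED_QUANTIZATION could be read in Python. Only the variable name comes from the review above. The helper name `use_fused_moe_kernel`, the accepted truthy strings, and the default of `False` are assumptions made for illustration, not vLLM's actual parsing logic, which this thread does not show.

```python
import os

# Name taken from the review summary; everything else here is illustrative.
ENV_NAME = "VLLM_USE_FUSED_MOE_KERNEL_IN_COMPRESSED_QUANTIZATION"

# Assumed set of strings treated as "enabled" (hypothetical convention).
_TRUTHY = {"1", "true", "yes", "on"}


def use_fused_moe_kernel(environ=None, default=False):
    """Return True if the flag is set to a truthy value.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get(ENV_NAME)
    if raw is None:
        # Variable not set at all: fall back to the assumed default.
        return default
    return raw.strip().lower() in _TRUTHY


if __name__ == "__main__":
    print(ENV_NAME, "enabled:", use_fused_moe_kernel())
```

Under these assumptions, launching a process with the variable set to `1` would enable the flag, and leaving it unset would fall back to the default.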

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) August 23, 2025 00:57
@github-actions github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Aug 23, 2025
@DarkLight1337 DarkLight1337 merged commit 308fa28 into vllm-project:main Aug 23, 2025
49 checks passed
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
xiao-llm pushed a commit to xiao-llm/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
mengxingkongzhouhan pushed a commit to mengxingkongzhouhan/vllm that referenced this pull request Aug 30, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Sep 3, 2025
ekagra-ranjan pushed a commit to ekagra-ranjan/vllm that referenced this pull request Sep 4, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025