Remove batch padding on ROCm #451

gshtras · 2025-02-26T21:28:18Z

A subset of the changes from the pending upstream PR vllm-project#10836.
We want this in order to unblock FP8 skinny GEMMs on non-Llama models (paged attention output not in FP8).

Remove batch padding on ROCm

3196f8c

gshtras requested a review from dllehr-amd February 26, 2025 21:28

gshtras requested review from charlifu, mawong-amd, shajrawi, maleksan85 and sunway513 as code owners February 26, 2025 21:28

shajrawi approved these changes Feb 26, 2025

gshtras merged commit f932181 into main Feb 26, 2025
3 of 5 checks passed

gshtras deleted the gshtras-patch-1 branch February 26, 2025 21:34
