Commit 1bbf332

divakar-amd authored and lk-chen committed
[rocm][V0] fix selection logic for custom PA in V0 (vllm-project#16426)
Signed-off-by: Divakar Verma <[email protected]>
1 parent 99d388a commit 1bbf332

1 file changed (+4, -1 lines changed)

vllm/platforms/rocm.py

Lines changed: 4 additions & 1 deletion
@@ -109,8 +109,11 @@ def use_rocm_custom_paged_attention(qtype: torch.dtype, head_size: int,
     ON_MI250_MI300 = any(arch in GPU_ARCH for arch in ["gfx90a", "gfx942"])
 
     # rocm custom page attention not support on navi (gfx1*)
+    # custom paged attn always supported on V0. On V1, requires sliding window
+    # disabled due to observed numerical discrepancy.
     return (ON_MI250_MI300 and not ON_NAVI
-            and (sliding_window == 0 or sliding_window == (-1, -1))
+            and (not envs.VLLM_USE_V1 or sliding_window == 0
+                 or sliding_window == (-1, -1))
             and (qtype == torch.half or qtype == torch.bfloat16)
             and (head_size == 64 or head_size == 128)
             and (block_size == 16 or block_size == 32)
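
The behavioral change is confined to the sliding-window clause of the return expression. A minimal sketch of that clause before and after the change follows. It is an illustration only, not the code in vllm/platforms/rocm.py: the helper names are hypothetical, `use_v1` is a plain boolean standing in for `envs.VLLM_USE_V1`, and the architecture, dtype, head-size, and block-size checks are left out.

```python
from typing import Tuple, Union

# Sliding window as it appears in the diff: an int or a (left, right) tuple.
SlidingWindow = Union[int, Tuple[int, int]]


def sliding_window_ok_before(sliding_window: SlidingWindow) -> bool:
    """Old clause (hypothetical helper): the window must be disabled
    (0 or (-1, -1)) no matter which engine version is running."""
    return sliding_window == 0 or sliding_window == (-1, -1)


def sliding_window_ok_after(sliding_window: SlidingWindow,
                            use_v1: bool) -> bool:
    """New clause (hypothetical helper): V0 always passes; V1 still
    requires the window to be disabled. `use_v1` stands in for
    envs.VLLM_USE_V1."""
    return (not use_v1 or sliding_window == 0
            or sliding_window == (-1, -1))


if __name__ == "__main__":
    # Compare both clauses for disabled windows and one enabled window.
    for use_v1 in (False, True):
        for sw in (0, (-1, -1), 4096):
            print(f"use_v1={use_v1!s:5} sliding_window={sw!s:8} "
                  f"before={sliding_window_ok_before(sw)} "
                  f"after={sliding_window_ok_after(sw, use_v1)}")
```

Under these assumptions, the only input whose result differs between the two clauses is an enabled sliding window on V0: the old clause rejects it, the new one accepts it. On V1 the two clauses agree for every input.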
