Support enable_gqa and only support 4D Q, K, and V #2558
Conversation
matcher=lambda sample: len(sample.input.shape) != 4
    or len(sample.args[0].shape) != 4
    or len(sample.args[1].shape) != 4,
reason="torch sdpa is expected to pass in 4d q, k, and v.",
@justinchuby @xadupre Let me know what you think about whether we should support only 4D Q, K, and V, or fully support whatever torch sdpa supports. Right now it seems QKV can be 3D or 4D in torch sdpa, and even 3D Q with 4D K and V.
Depending on the ATen op? Does the nn function do preprocessing on the inputs before sending them to the kernel? We just need to support whatever the kernel supports.
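For reference, a minimal sketch of the shape question being discussed: at the Python level, torch.nn.functional.scaled_dot_product_attention accepts both 3D and 4D Q/K/V (leading batch dims broadcast), so the decision here is whether the exporter normalizes to 4D before reaching the op. The shapes below are illustrative, not taken from the test suite.

    import torch
    import torch.nn.functional as F

    # 4D layout: (batch, heads, seq, head_dim)
    q4 = torch.randn(2, 8, 16, 64)
    k4 = torch.randn(2, 8, 16, 64)
    v4 = torch.randn(2, 8, 16, 64)
    out_4d = F.scaled_dot_product_attention(q4, k4, v4)

    # 3D layout: no explicit batch dimension
    q3 = torch.randn(8, 16, 64)
    k3 = torch.randn(8, 16, 64)
    v3 = torch.randn(8, 16, 64)
    out_3d = F.scaled_dot_product_attention(q3, k3, v3)

    print(out_4d.shape, out_3d.shape)
    # torch.Size([2, 8, 16, 64]) torch.Size([8, 16, 64])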
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2558      +/-   ##
==========================================
- Coverage   70.34%   70.32%   -0.03%
==========================================
  Files         218      222       +4
  Lines       26430    26645     +215
  Branches     2647     2663      +16
==========================================
+ Hits        18593    18738     +145
- Misses       6934     6991      +57
- Partials      903      916      +13

☔ View full report in Codecov by Sentry.
Thanks!
Could you also add these few lines as a micro optimization? Or we can do that separately.
It's already there:
Fixes #162258 Related to microsoft/onnxscript#2558 Pull Request resolved: #162771 Approved by: https://github.com/justinchuby
enable_gqa
NOTE: torch.nn.functional.scaled_dot_product_attention actually supports 3D inputs, and even 3D Q with 4D K and V, in the op tests.
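A small hedged example of the enable_gqa path named in the PR title (the enable_gqa flag is available in PyTorch 2.5 and later): with enable_gqa=True, K and V may carry fewer heads than Q, provided Q's head count is an integer multiple of the K/V head count. The shapes below are illustrative.

    import torch
    import torch.nn.functional as F

    # Grouped-query attention: 32 query heads share 8 key/value heads.
    q = torch.randn(1, 32, 128, 64)
    k = torch.randn(1, 8, 128, 64)
    v = torch.randn(1, 8, 128, 64)

    out = F.scaled_dot_product_attention(q, k, v, enable_gqa=True)
    print(out.shape)  # torch.Size([1, 32, 128, 64])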