AVX2/AArch64: Add native poly_pointwise_montgomery
#346
Graviton4

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 77278 cycles | 77359 cycles | 1.00 |
| ML-DSA-44 sign | 238351 cycles | 238702 cycles | 1.00 |
| ML-DSA-44 verify | 84670 cycles | 84808 cycles | 1.00 |
| ML-DSA-65 keypair | 137276 cycles | 137174 cycles | 1.00 |
| ML-DSA-65 sign | 385184 cycles | 386156 cycles | 1.00 |
| ML-DSA-65 verify | 137648 cycles | 137679 cycles | 1.00 |
| ML-DSA-87 keypair | 221254 cycles | 221386 cycles | 1.00 |
| ML-DSA-87 sign | 493976 cycles | 493662 cycles | 1.00 |
| ML-DSA-87 verify | 224032 cycles | 224278 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Graviton4 (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 131615 cycles | 131443 cycles | 1.00 |
| ML-DSA-44 sign | 455861 cycles | 455906 cycles | 1.00 |
| ML-DSA-44 verify | 151182 cycles | 142900 cycles | 1.06 |
| ML-DSA-65 keypair | 223983 cycles | 225008 cycles | 1.00 |
| ML-DSA-65 sign | 733346 cycles | 733460 cycles | 1.00 |
| ML-DSA-65 verify | 227150 cycles | 226945 cycles | 1.00 |
| ML-DSA-87 keypair | 370115 cycles | 370197 cycles | 1.00 |
| ML-DSA-87 sign | 938333 cycles | 938660 cycles | 1.00 |
| ML-DSA-87 verify | 376843 cycles | 377254 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Graviton2

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 128740 cycles | 129013 cycles | 1.00 |
| ML-DSA-44 sign | 418832 cycles | 420050 cycles | 1.00 |
| ML-DSA-44 verify | 142390 cycles | 142767 cycles | 1.00 |
| ML-DSA-65 keypair | 239992 cycles | 240308 cycles | 1.00 |
| ML-DSA-65 sign | 691430 cycles | 695323 cycles | 0.99 |
| ML-DSA-65 verify | 230845 cycles | 231294 cycles | 1.00 |
| ML-DSA-87 keypair | 362280 cycles | 362446 cycles | 1.00 |
| ML-DSA-87 sign | 874006 cycles | 880689 cycles | 0.99 |
| ML-DSA-87 verify | 375607 cycles | 375553 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Graviton3

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 82280 cycles | 82277 cycles | 1.00 |
| ML-DSA-44 sign | 253217 cycles | 254105 cycles | 1.00 |
| ML-DSA-44 verify | 92187 cycles | 92276 cycles | 1.00 |
| ML-DSA-65 keypair | 148982 cycles | 149066 cycles | 1.00 |
| ML-DSA-65 sign | 415976 cycles | 416387 cycles | 1.00 |
| ML-DSA-65 verify | 147655 cycles | 147774 cycles | 1.00 |
| ML-DSA-87 keypair | 232365 cycles | 232399 cycles | 1.00 |
| ML-DSA-87 sign | 526443 cycles | 526369 cycles | 1.00 |
| ML-DSA-87 verify | 239330 cycles | 239305 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Graviton2 (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 209562 cycles | 210419 cycles | 1.00 |
| ML-DSA-44 sign | 722381 cycles | 722578 cycles | 1.00 |
| ML-DSA-44 verify | 233528 cycles | 238793 cycles | 0.98 |
| ML-DSA-65 keypair | 376334 cycles | 377835 cycles | 1.00 |
| ML-DSA-65 sign | 1186815 cycles | 1187372 cycles | 1.00 |
| ML-DSA-65 verify | 370207 cycles | 371286 cycles | 1.00 |
| ML-DSA-87 keypair | 597405 cycles | 595539 cycles | 1.00 |
| ML-DSA-87 sign | 1516820 cycles | 1517135 cycles | 1.00 |
| ML-DSA-87 verify | 613145 cycles | 613408 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Graviton3 (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 136224 cycles | 136206 cycles | 1.00 |
| ML-DSA-44 sign | 451156 cycles | 451058 cycles | 1.00 |
| ML-DSA-44 verify | 147230 cycles | 147189 cycles | 1.00 |
| ML-DSA-65 keypair | 239506 cycles | 239220 cycles | 1.00 |
| ML-DSA-65 sign | 733037 cycles | 732792 cycles | 1.00 |
| ML-DSA-65 verify | 237567 cycles | 237444 cycles | 1.00 |
| ML-DSA-87 keypair | 390260 cycles | 390239 cycles | 1.00 |
| ML-DSA-87 sign | 947572 cycles | 947960 cycles | 1.00 |
| ML-DSA-87 verify | 397074 cycles | 396989 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Mac Mini (M1, 2020) benchmarks (opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 55129 cycles | 55117 cycles | 1.00 |
| ML-DSA-44 sign | 170291 cycles | 181260 cycles | 0.94 |
| ML-DSA-44 verify | 68909 cycles | 70286 cycles | 0.98 |
| ML-DSA-65 keypair | 98085 cycles | 98148 cycles | 1.00 |
| ML-DSA-65 sign | 271242 cycles | 288768 cycles | 0.94 |
| ML-DSA-65 verify | 108785 cycles | 110645 cycles | 0.98 |
| ML-DSA-87 keypair | 153013 cycles | 152964 cycles | 1.00 |
| ML-DSA-87 sign | 336690 cycles | 357460 cycles | 0.94 |
| ML-DSA-87 verify | 170662 cycles | 173366 cycles | 0.98 |

This comment was automatically generated by workflow using github-action-benchmark.
8b6bd85 to dc976dc
Signed-off-by: Matthias J. Kannwischer <[email protected]>
dc976dc to a4856a7
Intel Xeon 4th gen (c7i)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 46639 cycles | 46293 cycles | 1.01 |
| ML-DSA-44 sign | 141354 cycles | 143051 cycles | 0.99 |
| ML-DSA-44 verify | 52616 cycles | 52701 cycles | 1.00 |
| ML-DSA-65 keypair | 82758 cycles | 83834 cycles | 0.99 |
| ML-DSA-65 sign | 234561 cycles | 239453 cycles | 0.98 |
| ML-DSA-65 verify | 82524 cycles | 83668 cycles | 0.99 |
| ML-DSA-87 keypair | 125143 cycles | 125409 cycles | 1.00 |
| ML-DSA-87 sign | 283577 cycles | 287343 cycles | 0.99 |
| ML-DSA-87 verify | 126906 cycles | 127995 cycles | 0.99 |

This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 4th gen (c7i) (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 95700 cycles | 95631 cycles | 1.00 |
| ML-DSA-44 sign | 322282 cycles | 323360 cycles | 1.00 |
| ML-DSA-44 verify | 102390 cycles | 102372 cycles | 1.00 |
| ML-DSA-65 keypair | 163967 cycles | 163735 cycles | 1.00 |
| ML-DSA-65 sign | 528300 cycles | 528058 cycles | 1.00 |
| ML-DSA-65 verify | 162999 cycles | 162683 cycles | 1.00 |
| ML-DSA-87 keypair | 266517 cycles | 265734 cycles | 1.00 |
| ML-DSA-87 sign | 670862 cycles | 667701 cycles | 1.00 |
| ML-DSA-87 verify | 271269 cycles | 270607 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 3rd gen (c6i)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 76476 cycles | 76456 cycles | 1.00 |
| ML-DSA-44 sign | 216515 cycles | 220291 cycles | 0.98 |
| ML-DSA-44 verify | 85019 cycles | 85557 cycles | 0.99 |
| ML-DSA-65 keypair | 136112 cycles | 135777 cycles | 1.00 |
| ML-DSA-65 sign | 357643 cycles | 363388 cycles | 0.98 |
| ML-DSA-65 verify | 135074 cycles | 135930 cycles | 0.99 |
| ML-DSA-87 keypair | 205390 cycles | 205560 cycles | 1.00 |
| ML-DSA-87 sign | 434173 cycles | 441863 cycles | 0.98 |
| ML-DSA-87 verify | 205850 cycles | 206797 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 3rd gen (c6a)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 81207 cycles | 80849 cycles | 1.00 |
| ML-DSA-44 sign | 214285 cycles | 225721 cycles | 0.95 |
| ML-DSA-44 verify | 86200 cycles | 87762 cycles | 0.98 |
| ML-DSA-65 keypair | 142258 cycles | 142136 cycles | 1.00 |
| ML-DSA-65 sign | 347838 cycles | 362871 cycles | 0.96 |
| ML-DSA-65 verify | 137115 cycles | 139820 cycles | 0.98 |
| ML-DSA-87 keypair | 230776 cycles | 230443 cycles | 1.00 |
| ML-DSA-87 sign | 454061 cycles | 471376 cycles | 0.96 |
| ML-DSA-87 verify | 225608 cycles | 228079 cycles | 0.99 |

This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 3rd gen (c6i) (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 153986 cycles | 154048 cycles | 1.00 |
| ML-DSA-44 sign | 516623 cycles | 516877 cycles | 1.00 |
| ML-DSA-44 verify | 165341 cycles | 165471 cycles | 1.00 |
| ML-DSA-65 keypair | 260795 cycles | 260406 cycles | 1.00 |
| ML-DSA-65 sign | 833617 cycles | 833190 cycles | 1.00 |
| ML-DSA-65 verify | 264693 cycles | 264810 cycles | 1.00 |
| ML-DSA-87 keypair | 433114 cycles | 438000 cycles | 0.99 |
| ML-DSA-87 sign | 1070706 cycles | 1080273 cycles | 0.99 |
| ML-DSA-87 verify | 437811 cycles | 442889 cycles | 0.99 |

This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 4th gen (c7a)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 56105 cycles | 58037 cycles | 0.97 |
| ML-DSA-44 sign | 160387 cycles | 169809 cycles | 0.94 |
| ML-DSA-44 verify | 64308 cycles | 65688 cycles | 0.98 |
| ML-DSA-65 keypair | 103440 cycles | 98195 cycles | 1.05 |
| ML-DSA-65 sign | 263482 cycles | 257411 cycles | 1.02 |
| ML-DSA-65 verify | 100576 cycles | 98329 cycles | 1.02 |
| ML-DSA-87 keypair | 155197 cycles | 149538 cycles | 1.04 |
| ML-DSA-87 sign | 315363 cycles | 313605 cycles | 1.01 |
| ML-DSA-87 verify | 155592 cycles | 150685 cycles | 1.03 |

This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 3rd gen (c6a) (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 134008 cycles | 134104 cycles | 1.00 |
| ML-DSA-44 sign | 508048 cycles | 508236 cycles | 1.00 |
| ML-DSA-44 verify | 149202 cycles | 149075 cycles | 1.00 |
| ML-DSA-65 keypair | 223397 cycles | 223683 cycles | 1.00 |
| ML-DSA-65 sign | 814824 cycles | 812225 cycles | 1.00 |
| ML-DSA-65 verify | 233403 cycles | 233252 cycles | 1.00 |
| ML-DSA-87 keypair | 368395 cycles | 368108 cycles | 1.00 |
| ML-DSA-87 sign | 1027273 cycles | 1025498 cycles | 1.00 |
| ML-DSA-87 verify | 380965 cycles | 380626 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 4th gen (c7a) (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 118302 cycles | 119172 cycles | 0.99 |
| ML-DSA-44 sign | 417091 cycles | 418564 cycles | 1.00 |
| ML-DSA-44 verify | 131102 cycles | 131691 cycles | 1.00 |
| ML-DSA-65 keypair | 200376 cycles | 200755 cycles | 1.00 |
| ML-DSA-65 sign | 668860 cycles | 676135 cycles | 0.99 |
| ML-DSA-65 verify | 204699 cycles | 206045 cycles | 0.99 |
| ML-DSA-87 keypair | 332465 cycles | 335161 cycles | 0.99 |
| ML-DSA-87 sign | 864265 cycles | 869596 cycles | 0.99 |
| ML-DSA-87 verify | 340400 cycles | 341973 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Mac Mini (M1, 2020) benchmarks (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 112349 cycles | 112348 cycles | 1.00 |
| ML-DSA-44 sign | 409302 cycles | 409341 cycles | 1.00 |
| ML-DSA-44 verify | 130067 cycles | 130058 cycles | 1.00 |
| ML-DSA-65 keypair | 192847 cycles | 192877 cycles | 1.00 |
| ML-DSA-65 sign | 658811 cycles | 658871 cycles | 1.00 |
| ML-DSA-65 verify | 207224 cycles | 207182 cycles | 1.00 |
| ML-DSA-87 keypair | 316032 cycles | 316055 cycles | 1.00 |
| ML-DSA-87 sign | 832943 cycles | 832950 cycles | 1.00 |
| ML-DSA-87 verify | 339082 cycles | 339076 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 317762 cycles | 317096 cycles | 1.00 |
| ML-DSA-44 sign | 997799 cycles | 1018044 cycles | 0.98 |
| ML-DSA-44 verify | 341059 cycles | 342860 cycles | 0.99 |
| ML-DSA-65 keypair | 555083 cycles | 555550 cycles | 1.00 |
| ML-DSA-65 sign | 1660502 cycles | 1678586 cycles | 0.99 |
| ML-DSA-65 verify | 543467 cycles | 547237 cycles | 0.99 |
| ML-DSA-87 keypair | 925456 cycles | 918592 cycles | 1.01 |
| ML-DSA-87 sign | 2264430 cycles | 2226269 cycles | 1.02 |
| ML-DSA-87 verify | 920921 cycles | 923214 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 452588 cycles | 451876 cycles | 1.00 |
| ML-DSA-44 sign | 2008448 cycles | 2006209 cycles | 1.00 |
| ML-DSA-44 verify | 526344 cycles | 525659 cycles | 1.00 |
| ML-DSA-65 keypair | 761801 cycles | 759548 cycles | 1.00 |
| ML-DSA-65 sign | 3334109 cycles | 3318976 cycles | 1.00 |
| ML-DSA-65 verify | 819837 cycles | 817753 cycles | 1.00 |
| ML-DSA-87 keypair | 1228908 cycles | 1223157 cycles | 1.00 |
| ML-DSA-87 sign | 4143200 cycles | 4111725 cycles | 1.01 |
| ML-DSA-87 verify | 1316509 cycles | 1312412 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 128536 cycles | 128662 cycles | 1.00 |
| ML-DSA-44 sign | 418466 cycles | 419335 cycles | 1.00 |
| ML-DSA-44 verify | 142201 cycles | 142486 cycles | 1.00 |
| ML-DSA-65 keypair | 240068 cycles | 240198 cycles | 1.00 |
| ML-DSA-65 sign | 690896 cycles | 694792 cycles | 0.99 |
| ML-DSA-65 verify | 230790 cycles | 231005 cycles | 1.00 |
| ML-DSA-87 keypair | 361956 cycles | 361975 cycles | 1.00 |
| ML-DSA-87 sign | 872989 cycles | 879257 cycles | 0.99 |
| ML-DSA-87 verify | 375760 cycles | 375200 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 939171 cycles | 937991 cycles | 1.00 |
| ML-DSA-44 sign | 4325879 cycles | 4322008 cycles | 1.00 |
| ML-DSA-44 verify | 1071821 cycles | 1070846 cycles | 1.00 |
| ML-DSA-65 keypair | 1556170 cycles | 1555235 cycles | 1.00 |
| ML-DSA-65 sign | 7143723 cycles | 7134424 cycles | 1.00 |
| ML-DSA-65 verify | 1692624 cycles | 1691055 cycles | 1.00 |
| ML-DSA-87 keypair | 2525632 cycles | 2522308 cycles | 1.00 |
| ML-DSA-87 sign | 8747724 cycles | 8729409 cycles | 1.00 |
| ML-DSA-87 verify | 2710156 cycles | 2706680 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 209655 cycles | 209811 cycles | 1.00 |
| ML-DSA-44 sign | 722179 cycles | 721462 cycles | 1.00 |
| ML-DSA-44 verify | 228283 cycles | 228512 cycles | 1.00 |
| ML-DSA-65 keypair | 375596 cycles | 375810 cycles | 1.00 |
| ML-DSA-65 sign | 1186352 cycles | 1185979 cycles | 1.00 |
| ML-DSA-65 verify | 370168 cycles | 370610 cycles | 1.00 |
| ML-DSA-87 keypair | 596584 cycles | 595808 cycles | 1.00 |
| ML-DSA-87 sign | 1513903 cycles | 1514568 cycles | 1.00 |
| ML-DSA-87 verify | 612620 cycles | 613735 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 251270 cycles | 237197 cycles | 1.06 |
| ML-DSA-44 sign | 749767 cycles | 703842 cycles | 1.07 |
| ML-DSA-44 verify | 259927 cycles | 240763 cycles | 1.08 |
| ML-DSA-65 keypair | 476884 cycles | 463657 cycles | 1.03 |
| ML-DSA-65 sign | 1216737 cycles | 1186377 cycles | 1.03 |
| ML-DSA-65 verify | 435656 cycles | 418257 cycles | 1.04 |
| ML-DSA-87 keypair | 691818 cycles | 696260 cycles | 0.99 |
| ML-DSA-87 sign | 1535844 cycles | 1539476 cycles | 1.00 |
| ML-DSA-87 verify | 693012 cycles | 690657 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 251270 cycles | 237197 cycles | 1.06 |
| ML-DSA-44 sign | 749767 cycles | 703842 cycles | 1.07 |
| ML-DSA-44 verify | 259927 cycles | 240763 cycles | 1.08 |
| ML-DSA-65 verify | 435656 cycles | 418257 cycles | 1.04 |

This comment was automatically generated by workflow using github-action-benchmark.
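The alert above lists ML-DSA-65 verify but not ML-DSA-65 keypair, even though the keypair row in the full table also shows a ratio of 1.03. A minimal sketch of a check that reproduces this, assuming (this thread does not confirm it) that a row is flagged when the unrounded ratio current/previous strictly exceeds the threshold; the helper name `exceeds_threshold` is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of github-action-benchmark.
 * Assumption: a row is flagged when current / previous > 1.03,
 * compared on the unrounded ratio. Integer arithmetic keeps the
 * comparison exact. */
static bool exceeds_threshold(uint64_t current, uint64_t previous)
{
  return current * 100 > previous * 103;
}
```

Under this assumption, ML-DSA-65 verify (435656 vs. 418257 cycles) is flagged, while ML-DSA-65 keypair (476884 vs. 463657 cycles, about 1.0285 before rounding) is not.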
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 291947 cycles | 310373 cycles | 0.94 |
| ML-DSA-44 sign | 1080188 cycles | 1132919 cycles | 0.95 |
| ML-DSA-44 verify | 319505 cycles | 332037 cycles | 0.96 |
| ML-DSA-65 keypair | 546967 cycles | 559733 cycles | 0.98 |
| ML-DSA-65 sign | 1793387 cycles | 1850373 cycles | 0.97 |
| ML-DSA-65 verify | 528504 cycles | 536497 cycles | 0.99 |
| ML-DSA-87 keypair | 834990 cycles | 842080 cycles | 0.99 |
| ML-DSA-87 sign | 2260558 cycles | 2348501 cycles | 0.96 |
| ML-DSA-87 verify | 864553 cycles | 880866 cycles | 0.98 |

This comment was automatically generated by workflow using github-action-benchmark.
Signed-off-by: Matthias J. Kannwischer <[email protected]>
Signed-off-by: Matthias J. Kannwischer <[email protected]>
a4856a7 to 1fcd573
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 1fcd573 | Previous: 2c8f312 | Ratio |
|---|---|---|---|
| ML-DSA-65 keypair | 103440 cycles | 98195 cycles | 1.05 |
| ML-DSA-87 keypair | 155197 cycles | 149538 cycles | 1.04 |
| ML-DSA-87 verify | 155592 cycles | 150685 cycles | 1.03 |

This comment was automatically generated by workflow using github-action-benchmark.
poly_pointwise_montgomery assembly #336
poly_pointwise_montgomery assembly #337

This hoists the native pointwise multiplication out of #334 and adds similar code for AVX2.
Note that it does not yet modify the matrix-vector multiplication, which therefore does not use this code yet.
That will be done in a follow-up PR, which will make it easier to understand the performance impact.
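
For readers unfamiliar with the operation, here is a minimal portable C sketch of what a pointwise Montgomery multiplication computes. It is not the code from this PR: the plain-array signature and the `MLDSA_*` macro names are assumptions made for illustration, while the constants are the standard ML-DSA modulus q = 8380417 and q^(-1) mod 2^32 = 58728449.

```c
#include <stdint.h>

#define MLDSA_N 256           /* coefficients per polynomial */
#define MLDSA_Q 8380417       /* ML-DSA modulus q */
#define MLDSA_QINV 58728449u  /* q^(-1) mod 2^32 */

/* Returns r with r == a * 2^(-32) (mod q).
 * Relies on arithmetic right shift of negative values and on
 * wrapping unsigned-to-signed conversion, as on common compilers. */
static int32_t montgomery_reduce(int64_t a)
{
  /* t == a * q^(-1) (mod 2^32), so a - t*q is divisible by 2^32 */
  int32_t t = (int32_t)((uint32_t)a * MLDSA_QINV);
  return (int32_t)((a - (int64_t)t * MLDSA_Q) >> 32);
}

/* Illustrative signature (assumption): c[i] = a[i] * b[i] * 2^(-32) mod q
 * for every coefficient, with inputs in the NTT domain. */
static void poly_pointwise_montgomery(int32_t c[MLDSA_N],
                                      const int32_t a[MLDSA_N],
                                      const int32_t b[MLDSA_N])
{
  for (int i = 0; i < MLDSA_N; i++)
  {
    c[i] = montgomery_reduce((int64_t)a[i] * b[i]);
  }
}
```

Each output coefficient carries an extra factor of 2^(-32) mod q, which callers have to account for. The loop has no dependency between iterations, which is what makes it a natural candidate for the AVX2 and AArch64 vector implementations this PR adds.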