[MOD-7369] Add AVX512VL flag to FP16 flag to enhance performance. #522
With this flag, the dimension limit for choosing the advanced optimization is no longer required. Expanded the range of the low-dim spaces benchmarks.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #522   +/-   ##
=======================================
  Coverage   97.13%   97.13%
=======================================
  Files          94       94
  Lines        4886     4886
=======================================
  Hits         4746     4746
  Misses        140      140
How is this file and its functions completely new? Shouldn’t we have other functions to rename or remove?
I think I fixed it
Awesome
* add AVX512VL flag to FP16 flag to enhance performance. With this flag the dimension limit to choose the advanced opt is not required. Expanded low dim bm spaces range.
* add VL to building with prompt in compilation
* remove old file
(cherry picked from commit 3d77d23)
Successfully created backport PR for |
…e. (#524)
[MOD-7369] Add AVX512VL flag to FP16 flag to enhance performance. (#522)
* add AVX512VL flag to FP16 flag to enhance performance. With this flag the dimension limit to choose the advanced opt is not required. Expanded low dim bm spaces range.
* add VL to building with prompt in compilation
* remove old file
(cherry picked from commit 3d77d23)
Co-authored-by: meiravgri <[email protected]>
Background
In PR #477 we introduced distance functions that use `AVX512FP16` intrinsics to speed up `FLOAT16` distance calculations.
We observed that `_mm512_reduce_add_ph` incurs significant overhead, making distance calculations substantially slower than alternative implementations. However, this overhead becomes less significant at higher dimensions. Therefore, the `AVX512FP16` instruction set was enabled only for dimensions exceeding a certain threshold.
Current PR
This PR adds the `-mavx512vl` compilation flag to mitigate the latency issues observed with `_mm512_reduce_add_ph`.
Explanation:
The reduce intrinsic is a complex operation: under the hood, the compiler translates it into a sequence of instructions.
With the flag, the compiler uses 128/256-bit `vaddph` (packed half-precision add) instructions. Without the flag, it uses `vaddsh` (scalar half-precision add) instructions.
As demonstrated in the attached performance graph, the first option leads to improved performance across all dimensions.
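To make the difference between the two code paths more concrete, here is a minimal scalar model of a 32-lane horizontal sum. It is only a sketch: `float` stands in for half precision so it builds on any machine, the names `reduce_scalar_chain` and `reduce_packed_halving` are hypothetical, and the step counts describe this model rather than the instruction sequence the compiler actually emits.

```cpp
#include <array>
#include <cstddef>

// Scalar model of one 512-bit register holding 32 half-precision lanes.
// float stands in for the half-precision type (assumption, for portability).
constexpr std::size_t kLanes = 32;
using Lanes = std::array<float, kLanes>;

struct ReduceResult {
    float sum;      // horizontal sum of all lanes
    int add_steps;  // number of add operations issued (packed or scalar)
};

// Models the path described for builds without -mavx512vl:
// one scalar add (vaddsh-like) per remaining lane, forming a serial chain.
ReduceResult reduce_scalar_chain(const Lanes &v) {
    ReduceResult r{v[0], 0};
    for (std::size_t i = 1; i < kLanes; ++i) {
        r.sum += v[i];
        ++r.add_steps;
    }
    return r;
}

// Models the path described for builds with -mavx512vl:
// each step folds the upper half of the live lanes onto the lower half,
// the way a single packed add (vaddph-like) on a narrower vector would.
ReduceResult reduce_packed_halving(Lanes v) {
    ReduceResult r{0.0f, 0};
    for (std::size_t width = kLanes / 2; width >= 1; width /= 2) {
        for (std::size_t i = 0; i < width; ++i) {
            v[i] += v[i + width];  // all lanes of one packed add
        }
        ++r.add_steps;  // counted once per packed step
    }
    r.sum = v[0];
    return r;
}
```

In this model the halving version needs 5 add steps for 32 lanes, while the scalar chain needs 31 dependent adds, which is in line with the fixed per-call overhead mattering most at low dimensions.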
Additional changes
Expand the LOW_DIM spaces benchmark range: it now starts from dim = 55 instead of 100. The range still includes a no-residual dimension for all types (dim = 160).
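The no-residual claim can be checked with a few lines of arithmetic. The sketch below assumes that "no residual" means the dimension is a whole multiple of the number of elements in one 512-bit register, and that the element types are 2, 4 and 8 bytes wide; both are assumptions made for illustration, and the helper names are hypothetical.

```cpp
#include <cstddef>

// Elements of a given byte size that fit in one 512-bit (64-byte) register.
constexpr std::size_t lanes_per_512(std::size_t elem_bytes) {
    return 64 / elem_bytes;
}

// Trailing elements that do not fill a whole 512-bit chunk.
constexpr std::size_t residual(std::size_t dim, std::size_t elem_bytes) {
    return dim % lanes_per_512(elem_bytes);
}

// True when dim leaves no residual for 2-, 4- and 8-byte element types
// (assumed here to cover the benchmarked types).
constexpr bool no_residual_for_all(std::size_t dim) {
    return residual(dim, 2) == 0 && residual(dim, 4) == 0 &&
           residual(dim, 8) == 0;
}

static_assert(no_residual_for_all(160), "160 is a multiple of 32, 16 and 8");
static_assert(!no_residual_for_all(55), "55 leaves a residual");
```

Under these assumptions, dim = 160 is a multiple of 32, 16 and 8 lanes, while the new lower bound of 55 exercises the residual-handling path for every type.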