meiravgri
Collaborator

@meiravgri meiravgri commented Aug 12, 2024

Background

In PR #477 we introduced distance functions that use AVX512FP16 intrinsics to speed up FLOAT16 distance calculations.
We observed that _mm512_reduce_add_ph incurs significant overhead, which made distance calculations substantially slower than the alternative implementations. This overhead becomes less significant at higher dimensions, so the AVX512FP16 implementation was enabled only for dimensions above a certain threshold.

Current PR

This PR adds the -mavx512vl compilation flag to mitigate the latency observed with _mm512_reduce_add_ph.
Explanation:
The reduce intrinsic does not map to a single instruction; the compiler expands it into a sequence of instructions.
With the flag, the compiler uses the 128/256-bit vaddph instruction. Without the flag, it uses vaddsh.
As the attached performance graph shows, the first option performs better across all dimensions.
[Two attached images: performance graphs]
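To make the difference between the two add instructions concrete, here is a minimal, portable sketch of two ways to sum the lanes of one register. It is only a model: plain floats stand in for the 32 FP16 lanes of a 512-bit register, and the function names are hypothetical. It does not reproduce the instruction sequence the compiler actually emits for _mm512_reduce_add_ph, which the PR does not list. One function adds one lane at a time, as a scalar add such as vaddsh does, and the other folds the upper half of the lanes onto the lower half, as a packed add such as vaddph can do in a single instruction per step.

```cpp
#include <array>
#include <cstddef>

// Illustrative model only: plain floats stand in for the 32 FP16 lanes of a
// 512-bit register. This is NOT the compiler's real expansion of the intrinsic.
constexpr std::size_t kLanes = 32;
using Lanes = std::array<float, kLanes>;

// Scalar-style reduction: every add handles a single lane and depends on the
// previous result, so the model performs kLanes - 1 dependent adds.
float reduce_scalar_chain(const Lanes &v) {
    float sum = v[0];
    for (std::size_t i = 1; i < kLanes; ++i) {
        sum += v[i];
    }
    return sum;
}

// Packed-style reduction: each step folds the upper half onto the lower half
// (32 -> 16 -> 8 -> 4 -> 2 -> 1). The inner loop models one packed add that
// covers `width` lanes at once, so the model needs only five steps.
float reduce_packed_tree(Lanes v) {
    for (std::size_t width = kLanes / 2; width >= 1; width /= 2) {
        for (std::size_t i = 0; i < width; ++i) {
            v[i] += v[i + width];
        }
    }
    return v[0];
}
```

Both functions return the same sum for inputs whose partial sums are exactly representable; only the number of dependent steps in the model differs (31 versus 5). The measured gain on real hardware is what the performance graph reports, not something this sketch can predict.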

Additional changes

Expand the range of the LOW_DIM spaces benchmarks: it now starts from dim = 55 instead of 100. The range still includes a no-residual dimension for all types (dim = 160).
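The no-residual claim for dim = 160 can be checked with a little arithmetic. The sketch below assumes that "no residual" means the dimension is a whole multiple of the number of elements that fit in one 512-bit register, and that the element widths in question are 16, 32 and 64 bits. Both are assumptions, and the helper names are hypothetical rather than taken from the benchmark code.

```cpp
#include <cstddef>

// Assumption: a dimension has "no residual" when it fills a whole number of
// 512-bit registers, leaving no tail elements for a separate code path.
constexpr std::size_t kRegisterBits = 512;

// Number of elements of the given width that fit in one register.
constexpr std::size_t lanes_per_register(std::size_t element_bits) {
    return kRegisterBits / element_bits;
}

// True when `dim` leaves leftover elements after the full registers.
constexpr bool has_residual(std::size_t dim, std::size_t element_bits) {
    return dim % lanes_per_register(element_bits) != 0;
}

// dim = 160 is a multiple of 32, 16 and 8 lanes (assumed 16/32/64-bit types).
static_assert(!has_residual(160, 16), "160 = 5 * 32 lanes");
static_assert(!has_residual(160, 32), "160 = 10 * 16 lanes");
static_assert(!has_residual(160, 64), "160 = 20 * 8 lanes");
```

Under the same assumption, the new starting point dim = 55 does leave a residual for every width, so the expanded range exercises both the residual and the no-residual paths.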

With this flag, the dimension limit for choosing the advanced optimization is no longer required.

Expanded the low-dim benchmark spaces range.

codecov bot commented Aug 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.13%. Comparing base (d8e3b55) to head (38f9579).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #522   +/-   ##
=======================================
  Coverage   97.13%   97.13%           
=======================================
  Files          94       94           
  Lines        4886     4886           
=======================================
  Hits         4746     4746           
  Misses        140      140           


@meiravgri meiravgri requested review from alonre24 and GuyAv46 August 13, 2024 05:24
Collaborator

How are this file and its functions completely new? Shouldn't we have other functions to rename or remove?

Collaborator Author

I think I fixed it

@meiravgri meiravgri requested a review from GuyAv46 August 13, 2024 09:04
@meiravgri meiravgri changed the title from "add AVX512VL flag to FP16 flag to enhance performance." to "[MOD-5894] Add AVX512VL flag to FP16 flag to enhance performance." Aug 13, 2024
@meiravgri meiravgri changed the title from "[MOD-5894] Add AVX512VL flag to FP16 flag to enhance performance." to "[MOD-7369] Add AVX512VL flag to FP16 flag to enhance performance." Aug 13, 2024
Collaborator

@GuyAv46 GuyAv46 left a comment

Awesome

@meiravgri meiravgri added this pull request to the merge queue Aug 13, 2024
Merged via the queue into main with commit 3d77d23 Aug 13, 2024
26 checks passed
@meiravgri meiravgri deleted the meiravg_add_VL_flag_to_fp16 branch August 13, 2024 11:39
github-actions bot pushed a commit that referenced this pull request Aug 13, 2024
* add AVX512VL flag to FP16 flag to enhance performance.

With this flag the dimension limit to chose the advanced opt is not required.

Expanded low dim bm spaces range.

* add VL to building with prompt in compilation

* remove old file

(cherry picked from commit 3d77d23)
Successfully created backport PR for 0.8:

meiravgri added a commit that referenced this pull request Aug 13, 2024
…e. (#524)

[MOD-7369] Add AVX512VL flag to FP16 flag to enhance performance. (#522)

* add AVX512VL flag to FP16 flag to enhance performance.

With this flag the dimension limit to chose the advanced opt is not required.

Expanded low dim bm spaces range.

* add VL to building with prompt in compilation

* remove old file

(cherry picked from commit 3d77d23)

Co-authored-by: meiravgri <[email protected]>