sort: Make use of ExtendedBigDecimal in -g sorting, then attempt to recover some performance #8062
See #8031.
This fixes GNUTest sort-float, and actually improves a bit on GNU coreutils behaviour as we now use arbitrary precision to sort floats.
This comes with a significant performance cost, so I tried to recover some of it by optimizing parsing, but couldn't get back to the original runtime. We're still much faster than GNU sort.
./sort-main is current upstream/main, ./sort-baseline is after just the first commit, and target/release/sort is after optimizations. sort is GNU coreutils.
test_sort: Add one more test checking arbitrary precision handling
test_sort: Add more sort use cases
test_g_float comes from GNU test, the other one is manually crafted.
docs/src/extensions: Sort uses arbitrary precision decimal numbers
uucore: num_parser: Optimize bigdecimal create when exponent is 0
Makes creating a float number without an exponent part quite a bit
faster. Saves about 9% speed in sort -g.
uucore: num_parser: Optimize parse_digits_count
parse_digits_count is a significant hotspot in parsing code.
In particular, any add/mul operation on BigUint is fairly slow,
so it's better to accumulate digits in a u64, then add them
to the resulting BigUint.
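The idea can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `parse_digits_chunked` is hypothetical, and `u128` stands in for `BigUint` so the example needs no external crates (which limits it to inputs of at most 38 digits). Digits are accumulated with cheap `u64` arithmetic, and the wide accumulator is only touched once per 19-digit chunk.

```rust
/// Parse a leading run of ASCII decimal digits and return (value, digit count).
/// ASSUMPTION: `u128` is a stand-in for `BigUint`; the name and signature are
/// illustrative only and do not mirror the real uucore function.
fn parse_digits_chunked(s: &str) -> (u128, usize) {
    // Any 19-digit decimal value fits in a u64 (u64::MAX is about 1.8e19).
    const CHUNK_DIGITS: u32 = 19;

    let mut total: u128 = 0; // the "expensive" wide accumulator
    let mut chunk: u64 = 0; // the cheap machine-word accumulator
    let mut chunk_len: u32 = 0;
    let mut count: usize = 0;

    for b in s.bytes().take_while(|b| b.is_ascii_digit()) {
        chunk = chunk * 10 + u64::from(b - b'0');
        chunk_len += 1;
        count += 1;
        if chunk_len == CHUNK_DIGITS {
            // One wide multiply-add per 19 digits instead of one per digit.
            total = total * 10u128.pow(CHUNK_DIGITS) + u128::from(chunk);
            chunk = 0;
            chunk_len = 0;
        }
    }
    // Flush the final partial chunk, if any.
    if chunk_len > 0 {
        total = total * 10u128.pow(chunk_len) + u128::from(chunk);
    }
    (total, count)
}
```

For example, `parse_digits_chunked("12345678901234567890123")` performs two wide multiply-adds (one for the first 19 digits, one for the remaining 4) rather than 23.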
Saves about 15-20% performance in sort -g.
uucore: num_parser: Improve scale conversion to i64
It turns out repeatedly calling i64::MAX.into() and i64::MIN.into()
is actually very expensive. Just do the conversion first, and if
it fails, we know why.
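A rough sketch of the "convert first, diagnose on failure" pattern is shown below. It is an assumption-laden illustration rather than the actual change: `i128` stands in for the big integer type, and `ScaleError` and `scale_to_i64` are hypothetical names. The point is that no bound comparison happens up front; the sign is inspected only when the conversion has already failed.

```rust
/// Why a scale did not fit in an i64 (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum ScaleError {
    Overflow,
    Underflow,
}

/// Convert a wide scale value to i64.
/// ASSUMPTION: `i128` is a stand-in for the big integer used in the real code.
fn scale_to_i64(scale: i128) -> Result<i64, ScaleError> {
    // Try the conversion directly instead of first comparing against
    // i64::MIN and i64::MAX converted into the wide type.
    i64::try_from(scale).map_err(|_| {
        // The conversion failed, so the value is out of range; its sign
        // tells us on which side.
        if scale < 0 {
            ScaleError::Underflow
        } else {
            ScaleError::Overflow
        }
    })
}
```

In the common case (the scale fits), this costs a single conversion and no extra allocations or comparisons.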
Sadly there is still a conversion happening under the hood in -exponent + scale, but that'd need to be fixed in Bigint.
Improves sort -g performance by ~5%.
sort: Make use of ExtendedBigDecimal in -g sorting
This provides better precision than f64, which we need.
Fixed #8031.
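To illustrate why f64 is not precise enough here, the following standalone sketch (not taken from the PR; `cmp_as_f64` is a hypothetical helper) compares two decimal strings after parsing them as f64. Inputs that differ only beyond roughly 17 significant digits round to the same f64, so a comparison based on f64 cannot order them.

```rust
use std::cmp::Ordering;

/// Compare two numeric strings by parsing both as f64.
/// Returns None if either string does not parse or a value is NaN.
/// Hypothetical helper for illustration only.
fn cmp_as_f64(a: &str, b: &str) -> Option<Ordering> {
    let x: f64 = a.parse().ok()?;
    let y: f64 = b.parse().ok()?;
    x.partial_cmp(&y)
}
```

With this helper, `cmp_as_f64("0.1000000000000000000000001", "0.1000000000000000000000002")` yields `Some(Ordering::Equal)` even though the second value is strictly larger. An arbitrary-precision decimal type keeps the two values distinct.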