Skip to content

Conversation

tgrez
Copy link
Contributor

@tgrez tgrez commented Jun 25, 2025

Fixes #8202

I'm not sure if this fix is complete, because I couldn't reproduce the original behavior of uutils impl with GNU od, so I don't know in which situations the previous implementation was valid (I assume there is/was a reason for implementing it the way it was implemented).

Regardless of the above, this fixes the reported issue. It still lacks localization support, but it works correctly for LC_ALL=C as well as LC_ALL=en_US.UTF-8.

@tgrez tgrez force-pushed the main branch 3 times, most recently from 50ecff0 to 94feb2d Compare June 25, 2025 15:59
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@RenjiSann
Copy link
Collaborator

It still lacks localization support

I'm not familiar with od, do you have an example of its behavior changing depending on the locale ?

@tgrez
Copy link
Contributor Author

tgrez commented Jun 26, 2025

It still lacks localization support

I'm not familiar with od, do you have an example of its behavior changing depending on the locale ?

I'd say the difference could come from different encodings, eg:

$ echo -n 'ą' | iconv -f UTF-8 -t ISO-8859-2 | od -t c
0000000 261
0000001
$ echo -n 'ą' | od -t c
0000000 304 205
0000002

@tgrez
Copy link
Contributor Author

tgrez commented Jun 26, 2025

And also:

$ echo -n 'ą' | LC_ALL=pl_PL.ISO-8859-2 od -t c
0000000   ? 205
0000002
$ echo -n 'ą' | LC_ALL=C od -t c
0000000 304 205
0000002

@RenjiSann
Copy link
Collaborator

I'd say the difference could come from different encodings, eg:

$ echo -n 'ą' | iconv -f UTF-8 -t ISO-8859-2 | od -t c
0000000 261
0000001
$ echo -n 'ą' | od -t c
0000000 304 205
0000002

Unless I'm mistaken, in this example, it's the input given to od that changes, so I don't think it changes anything for od in itself.

@RenjiSann
Copy link
Collaborator

I think this fix is good though, I'll happily merge it once the typo is addressed ^^'

@tgrez
Copy link
Contributor Author

tgrez commented Jun 26, 2025

Unless I'm mistaken, in this example, it's the input given to od that changes, so I don't think it changes anything for od in itself.

Ah, yes, you're correct.

I'll happily merge it once the typo is addressed ^^'

Sorry for the confusion, which typo? :)

@RenjiSann
Copy link
Collaborator

RenjiSann commented Jun 26, 2025

Sorry for the confusion, which typo? :)

C.f. my comment #8268 (comment)

@tgrez
Copy link
Contributor Author

tgrez commented Jun 26, 2025

Happy to adjust anything, including typos.

Sorry, but I can't see this comment, after I click the link I'm redirected only to this PR, adding some screenshots:

Screenshot 2025-06-26 at 22 16 27 Screenshot 2025-06-26 at 22 16 59

@@ -51,33 +50,13 @@ fn format_item_c(bytes: &[u8]) -> String {
let b = bytes[0];

if b & 0x80 == 0x00 {
// // ASCII byte (0xxxxxxx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// // ASCII byte (0xxxxxxx)
// ASCII byte (0xxxxxxx)

@RenjiSann
Copy link
Collaborator

That's absolutely my bad, I didn't see this was still pending
Sorry, thanks for noticing 😅

@tgrez
Copy link
Contributor Author

tgrez commented Jun 27, 2025

That's absolutely my bad, I didn't see this was still pending Sorry, thanks for noticing 😅

No worries! And thank you for the review!

Fixed the typo, rebased on the latest main and retested.

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

@RenjiSann RenjiSann merged commit dcc5492 into uutils:main Jun 27, 2025
76 checks passed
@RenjiSann
Copy link
Collaborator

Thank you for your contribution ! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

od incorrectly handles non ascii chars
2 participants