Skip to content

Conversation

Bomme
Copy link
Contributor

@Bomme Bomme commented Sep 11, 2025

This bumps dill to 0.3.9 and closes #7510

It turns out the only thing required to make the tests pass was to extend the version checks to include 0.3.9.

Bomme and others added 2 commits September 11, 2025 21:35
….9 support (#2)

* Initial plan

* Extract DILL_VERSION check to private function and add 0.3.9 support

Co-authored-by: Bomme <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
@lhoestq
Copy link
Member

lhoestq commented Sep 12, 2025

Have you tried to run pytest tests/test_fingerprint.py ? It seems dill 0.3.9 breaks a lot of tests

FAILED tests/test_fingerprint.py::TokenizersHashTest::test_hash_regex - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::TokenizersHashTest::test_hash_tokenizer - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::TokenizersHashTest::test_hash_tokenizer_with_cache - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::RecurseHashTest::test_hash_ignores_line_definition_of_function - AssertionError: 'c48ebfacf8768f50' != '27e49d047c02c83b'
FAILED tests/test_fingerprint.py::RecurseHashTest::test_hash_ipython_function - AssertionError: '65edc6b6d425a8e9' != '9f364fe298fb286a'
FAILED tests/test_fingerprint.py::HashingTest::test_hash_tiktoken_encoding - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::HashingTest::test_hash_torch_compiled_module - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::HashingTest::test_hash_torch_generator - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::HashingTest::test_hash_torch_tensor - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::HashingTest::test_set_doesnt_depend_on_order - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::HashingTest::test_set_stable - NameError: name 'log' is not defined
FAILED tests/test_fingerprint.py::test_move_script_doesnt_change_hash - AssertionError: assert b'93072ca404a697db\n' == b'cf89a7e497a97e32\n'

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Bomme
Copy link
Contributor Author

Bomme commented Sep 12, 2025

Hi @lhoestq! Yes, I did. It's not really dill that breaks things. Rather the shims that datasets has in place did not include the next version.

FYI: I also tested it with dill-0.4.0 and the changes would need to be analogous, but I wanted to be conservative in this PR.

@lhoestq
Copy link
Member

lhoestq commented Sep 12, 2025

The NameError is fixed in your PR since it defines the right log() function for 0.3.9.

But I'm less sure about the AssertionError that may be related to deterministic hashing or ipython/shell function hashing. We would need to solve these

EDIT: ah actually it does ! cool ! let me update the branch and re-run the CI

Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm :)

could you also add the same for 0.4.0 before we merge ?

the CI failures are unrelated to this PR

@Bomme Bomme changed the title Bump dill to 0.3.9 Bump dill to 0.4.0 Sep 12, 2025
@lhoestq lhoestq merged commit a3a1069 into huggingface:main Sep 15, 2025
10 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incompatibile dill version (0.3.9) in datasets 2.18.0 - 3.5.0
3 participants