dafnapension
Collaborator

@dafnapension dafnapension commented Apr 5, 2025

__type__ in the catalog is expressed as a dict {module: module, name: class_name}, and classes are instantiated from it through Python's import utilities.
This means that if a class c is defined in a file at path p, and c is referenced in a catalog entry, then, since with this PR c is referenced through p, the reference forces the defining file to stay at p in the file system. This is the same constraint imposed by a line of code reading from p import c. If the defining file moves in the file system, the reference in the catalog must be updated, just as any such import line must be updated.
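
As a minimal sketch of the mechanism described above (not the PR's actual code): the helper name instantiate_from_type and the sample entry are hypothetical, and a standard-library class stands in for a unitxt class so the snippet runs on its own.

```python
import importlib


def instantiate_from_type(type_ref, **kwargs):
    """Import type_ref["module"], look up type_ref["name"] in it, and instantiate it.

    Hypothetical helper for illustration only; not the function used in unitxt.
    """
    module = importlib.import_module(type_ref["module"])  # fails if the module has moved
    cls = getattr(module, type_ref["name"])
    return cls(**kwargs)


# Stand-in catalog entry: __type__ is a dict holding the module and the class name.
entry = {
    "__type__": {"module": "fractions", "name": "Fraction"},
    "numerator": 3,
    "denominator": 4,
}
params = {key: value for key, value in entry.items() if key != "__type__"}
obj = instantiate_from_type(entry["__type__"], **params)
print(obj)  # prints 3/4
```

If the module named in the entry no longer exists at that import path, importlib.import_module raises ModuleNotFoundError, which is the same failure a stale from p import c line would produce.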

Backward Compatibility:
(1) A utility, utils/prepare_all_artifacts.py, is provided, which transforms a given catalog to the new format by running the set of prepare modules. It needs to be invoked once per project.

(2) Even if it is not converted to the new format, a catalog in the old format (where __type__ has a string value, the snake-case form of the class name) can still be read and worked with by the PR's code: the code translates each __type__ to the dict format upon loading from the catalog. This works for every __type__ that refers to a unitxt class (a class in the unitxt/src/unitxt directory). Yet to be developed: on-the-fly translation of __type__ values that refer to user-defined classes.
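
The load-time translation in (2) could look roughly like the sketch below. This is an assumption-laden illustration, not the PR's code: camel_to_snake, build_index and translate_type are hypothetical names, the snake-case rule is a simple guess at the old convention, and the standard-library collections module stands in for the unitxt package.

```python
import importlib
import inspect
import re


def camel_to_snake(name):
    """Insert "_" before each capital letter (except the first) and lowercase the result."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def build_index(module_name):
    """Map the snake-case form of each class found in a module to a {module, name} dict."""
    module = importlib.import_module(module_name)
    return {
        camel_to_snake(name): {"module": module.__name__, "name": name}
        for name, _ in inspect.getmembers(module, inspect.isclass)
    }


def translate_type(type_value, index):
    """Return type_value in the dict format, translating an old-style string if needed."""
    if isinstance(type_value, dict):
        return type_value  # already in the new format
    if type_value not in index:
        # Mirrors the open item above: classes outside the indexed package are not handled.
        raise ValueError(f"cannot translate __type__ {type_value!r}")
    return index[type_value]


index = build_index("collections")  # stand-in for the library's own package
new_type = translate_type("ordered_dict", index)
```

Here new_type is {"module": "collections", "name": "OrderedDict"}. A string that is not in the index raises an error, which corresponds to the user-defined classes that the on-load translation does not yet cover.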

@dafnapension dafnapension force-pushed the json branch 2 times, most recently from d9425b2 to 3c9f7fa Compare April 6, 2025 20:22
@elronbandel
Member

For: #1575

@dafnapension dafnapension force-pushed the json branch 10 times, most recently from 54efa18 to 137df10 Compare April 9, 2025 20:51
@dafnapension dafnapension changed the title from "__type__ in catalog to be expressed as qualified class name, registry maps each such qualified class name to itself, toward removal of registry" to "__type__ in catalog is expressed as qualified class name, thereby classes are instantiated from it through python's import. Artifact.class_register is removed altogether" Apr 9, 2025
@dafnapension dafnapension changed the title from "__type__ in catalog is expressed as qualified class name, thereby classes are instantiated from it through python's import. Artifact.class_register is removed altogether" to "For issue 1575: Eliminating Manual Class Registration in Unitxt with Import Paths" Apr 10, 2025
@dafnapension dafnapension changed the title from "For issue 1575: Eliminating Manual Class Registration in Unitxt with Import Paths" to "For issue 1575: Eliminating Manual Class Registration in Unitxt, replaced by Import Paths" Apr 10, 2025
@dafnapension dafnapension force-pushed the json branch 2 times, most recently from 83458db to 5c835c1 Compare April 10, 2025 20:27
@dafnapension dafnapension force-pushed the json branch 4 times, most recently from 1ef8d61 to 663239e Compare May 4, 2025 19:18
@dafnapension dafnapension force-pushed the json branch 3 times, most recently from 429047d to f3e6412 Compare May 20, 2025 15:42
@dafnapension
Collaborator Author

dafnapension commented May 20, 2025

(virtual310) dafna@LAPTOP-ICP8MAPV:~/workspaces/unitxt$ git diff main...json --name-only | grep -v /catalog/
.github/workflows/catalog_consistency.yml
docs/catalog.py
docs/conf.py
prepare/cards/mtrag.py
prepare/metrics/custom_f1.py
prepare/tasks/qa/tasks.py
src/unitxt/artifact.py
src/unitxt/catalog.py
src/unitxt/dataset_utils.py
src/unitxt/deprecation_utils.py
src/unitxt/register.py
src/unitxt/settings_utils.py
src/unitxt/text_utils.py
tests/library/test_artifact.py
tests/library/test_artifact_recovery.py
tests/library/test_artifact_registration.py
tests/library/test_catalogs.py
tests/library/test_function_operators.py
tests/library/test_recipe.py
tests/library/test_text_utils.py
utils/check_catalog_consistency.py
utils/prepare_all_artifacts.py
(virtual310) dafna@LENAHUVA:~/workspaces/unitxt$

@dafnapension dafnapension force-pushed the json branch 3 times, most recently from b0fc597 to d576e03 Compare May 20, 2025 20:31
dafnapension and others added 4 commits September 21, 2025 15:11
… and name, rather than snake of class name, and use _class_register only for special cases (like deprecated classes)

Signed-off-by: dafnapension <[email protected]>
Signed-off-by: elronbandel <[email protected]>
3 participants