Skip to content

Conversation

dstandish
Copy link
Contributor

@dstandish dstandish commented Apr 30, 2025

Use latest bundle version when clearing / re-running dag

When clearing a dag to be rerun, if the bundle is versioned, and bundle versioning is enabled for the dag, we should clear the bundle version (of the dag run) to be the latest bundle version seen for the dag.

I removed the dag param. There's not much point in passing dag in as a param to this function since the TIs can be from different dags, so we have to look it up anyway.

Moreover, they can be from different serdags associated with the same dag -- and that in particular is what makes this complicated and required me to go and use the SchedulerDagBag object.

Most of the ugliness of this PR comes from the fact that it requires that there be serialized dags which we sometimes do not create in our tests -- but which are always going to be there in production. And when I enable them in the test code, then it tends to require changes to the tests.

resolves: #49639

@dstandish dstandish force-pushed the use-latest-bundle-version-when-clearing branch from be9640d to 4bc265d Compare April 30, 2025 21:42
dstandish added a commit to astronomer/airflow that referenced this pull request Apr 30, 2025
Making dag required helps me in apache#50040.  It's always passed in all production code so, we can make the change.  Should be thought of as private anyway.  And making it kwarg only, well why not.
@dstandish dstandish marked this pull request as ready for review May 1, 2025 04:46
@dstandish dstandish requested review from XD-DENG and ashb as code owners May 1, 2025 04:46
@phanikumv phanikumv requested a review from uranusjr May 2, 2025 05:05
@dstandish dstandish force-pushed the use-latest-bundle-version-when-clearing branch from 4bc265d to a9d9a3b Compare May 5, 2025 18:00
@eladkal eladkal added this to the Airflow 3.0.1 milestone May 6, 2025
@eladkal eladkal added the type:bug-fix Changelog: Bug Fixes label May 6, 2025
@dstandish dstandish force-pushed the use-latest-bundle-version-when-clearing branch 2 times, most recently from 2e5cfa9 to 4ff97f8 Compare May 6, 2025 13:07
Copy link
Contributor

@ephraimbuddy ephraimbuddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. My worry is that the clear_task_instances was not a private function. Is it ok to change it to a private function now?

@dstandish
Copy link
Contributor Author

This looks good. My worry is that the clear_task_instances was not a private function. Is it ok to change it to a private function now?

Right -- understandable.

My thinking is that it was not intentional to make that a public function, and that I am "fixing" that in this patch release.

@ephraimbuddy
Copy link
Contributor

This looks good. My worry is that the clear_task_instances was not a private function. Is it ok to change it to a private function now?

Right -- understandable.

My thinking is that it was not intentional to make that a public function, and that I am "fixing" that in this patch release.

The function has been around for a while(Airflow 2 or 1 not sure) and it looks like people use it in their DAGs e.g #30237 and #19329 (comment)

@kaxil
Copy link
Member

kaxil commented May 6, 2025

Btw clear_task_instances can't be used in tasks in 3.0+ anyway since it uses DB

@dstandish dstandish added the full tests needed We need to run full set of tests for this PR to merge label May 6, 2025
@dstandish dstandish force-pushed the use-latest-bundle-version-when-clearing branch from 79ac45b to bcaa368 Compare May 6, 2025 18:23
@kaxil kaxil modified the milestones: Airflow 3.0.1, Airflow 3.0.2 May 6, 2025
dstandish added 10 commits May 8, 2025 08:39
When clearing a dag to be rerun, if the bundle is versioned, and bundle versioning is enabled for the dag, we should clear the bundle version (of the dag run) to be the latest bundle version seen for the dag.

I removed the dag param. There's not much point in passing dag in as a param to this function since the TIs can be from different dags, so we have to look it up anyway.

Moreover, they can be from different serdags associated with the same dag -- and that in particular is what makes this complicated and required me to go and use the SchedulerDagBag object.
@dstandish dstandish force-pushed the use-latest-bundle-version-when-clearing branch from 7f19372 to 619938f Compare May 8, 2025 15:57
@uranusjr uranusjr merged commit c069401 into apache:main May 9, 2025
96 checks passed
@uranusjr uranusjr deleted the use-latest-bundle-version-when-clearing branch May 9, 2025 10:26
kaxil pushed a commit that referenced this pull request May 12, 2025
Co-authored-by: Tzu-ping Chung <[email protected]>
(cherry picked from commit c069401)
@dstandish dstandish added the backport-to-v3-0-test Mark PR with this label to backport to v3-0-test branch label May 21, 2025
kaxil pushed a commit that referenced this pull request Jun 3, 2025
Co-authored-by: Tzu-ping Chung <[email protected]>
(cherry picked from commit c069401)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-to-v3-0-test Mark PR with this label to backport to v3-0-test branch full tests needed We need to run full set of tests for this PR to merge type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ensure that we clear bundle version when clearing a run -- for backfill or otherwise
6 participants