-
Notifications
You must be signed in to change notification settings - Fork 15.5k
Improve cache usage for prek hooks #55328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
I need to measure the gain, but I feel it's worth it |
a90be24
to
a977405
Compare
I'd be interested in the gain. Do we actually have "endless" storage? Do we need to purge this at some time? How can cache be invalidated in case of "The problem is not DNS but cache corruption"? |
cc: @jscheffl
Certainly worth it. The stash / restore action works in the way that it uses artifacts from current and "main" builds if current is not available - and even the "current" build uses initially the "main" build as source of cache. We have automated retention of the cache (each artifact is kept for 2 days ( retention-days: '2'`) - so with our "velocity" that's more than enough :) We can invalidate the cache by bumping the v6 to v7 and so-on -> this how we can force cache invalidation. Also any change to teh .pre-commit-yaml will make the hash change and cache invalidated. In the future when we get monorepo we will hash all the Very good question BTW. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the answers and aaaah, now I see it in the YAML (in German we say "Tomatoes on my eyes..."): retention-days: '2'
and key
for the cache.
Great!
Previously we were only using prek cache from canary builds, but since we are starting to use several separate prek runs, it makes sense to install prek hooks only once and store them in cache and reuse even in the same build. This PR removes "only-canary" prek cache preparation - now all builds including all PRs from fork preapare prek cache once and upload them as artifacts - then restoring prek cache for every prek run should be much faster.
a977405
to
3e47332
Compare
Indeed @assignUser had done a very good job with the action we are using. BTW, Jacob is known as |
path: /tmp/cache-prek.tar.gz | ||
if-no-files-found: 'error' | ||
retention-days: '2' | ||
if: inputs.save-cache == 'true' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we are defaulting to upload cache to artifact for every build, do we need this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :) nice
Previously we were only using prek cache from canary builds, but since we are starting to use several separate prek runs, it makes sense to install prek hooks only once and store them in cache and reuse even in the same build.
This PR removes "only-canary" prek cache preparation - now all builds including all PRs from fork preapare prek cache once and upload them as artifacts - then restoring prek cache for every prek run should be much faster.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in airflow-core/newsfragments.