Description
We do a lot of work on CI, and it's extremely difficult to keep track of how it all fits together. We've also had to deal with a lot of pain arising from the fact that:
- Every workflow on CI is written in untyped YAML
- Most of what CI does can't run locally [1]
There are two big changes we can make to drastically improve the situation:
- Make everything on CI runnable locally
- Design a custom DSL and transpile it to GHA YAML files
Local-first CI
Every job first installs its dependencies, and then it runs some code [2]. We want to ensure that all of that code is also runnable locally on every developer machine. To achieve this, we have to refactor every CI job to be a wrapper over this basic two-step process (`install` + `run`).
The sync release assets workflow is a great example of what we want all of our CI to look like, especially in that the inputs to the script are passed in explicitly.
Some notes for the process of extracting a CI job to run locally:
- Installing dependencies should be done using our docker image(s) together with pixi.
- Most third-party actions simply wrap a CLI tool (or multiple), so we should replace their usage with usage of the underlying CLI tool(s)
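As a rough sketch of the `install` + `run` shape described above, a job's logic could live in a script that CI and developers invoke in exactly the same way, with every input passed explicitly. The script path `scripts/sync_assets.py` and the `--version` and `--dest` inputs are hypothetical placeholders; the only assumption about pixi is that `pixi install` and `pixi run <command>` are available in the Docker image.

```python
"""Hypothetical local-first job: one entry point for both CI and a laptop."""
import argparse
import subprocess


def install_cmd() -> list[str]:
    # Step 1 (install): set up the environment described by the pixi manifest.
    return ["pixi", "install"]


def run_cmd(version: str, dest: str) -> list[str]:
    # Step 2 (run): execute the job's code inside the pixi environment.
    # The script path and its flags are placeholders for illustration.
    return [
        "pixi", "run", "python", "scripts/sync_assets.py",
        "--version", version,
        "--dest", dest,
    ]


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="install + run wrapper (sketch)")
    # Inputs are explicit flags rather than values read from the CI environment.
    parser.add_argument("--version", required=True)
    parser.add_argument("--dest", default="dist")
    parser.add_argument("--dry-run", action="store_true",
                        help="print the commands without executing them")
    return parser.parse_args(argv)


def main(argv=None) -> int:
    args = parse_args(argv)
    for cmd in (install_cmd(), run_cmd(args.version, args.dest)):
        print("+", " ".join(cmd))
        if not args.dry_run:
            subprocess.run(cmd, check=True)
    return 0
```

With this shape, the CI job only has to call `main()` with the same flags a developer would type locally, so the YAML side stays a thin wrapper.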
Codegen GHA away
As long as every job is not much more complex than `install` + `run`, it should be possible to ditch the GHA YAML files entirely, and instead use a custom DSL as the input to a GHA YAML file generator. Even if all this code generator did was use a different configuration file format and transpile it to YAML, it would still be a big improvement in developer experience, but we can do much more than that.
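To make the idea concrete, here is a minimal sketch of what such a generator could look like if the DSL were embedded in Python. Everything in it is an assumption for illustration: the `Job` and `Workflow` types, the `render` function, the placeholder container image, and the choice of Python itself. It only covers the `install` + `run` job shape and assumes the image already provides pixi.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    id: str
    run: str                                  # the single command the job runs
    needs: tuple[str, ...] = ()               # ids of jobs that must finish first
    image: str = "ghcr.io/example/ci:latest"  # placeholder Docker image


@dataclass(frozen=True)
class Workflow:
    name: str
    on: str                                   # a single trigger, e.g. "pull_request"
    jobs: tuple[Job, ...]


def q(value: str) -> str:
    # A JSON string literal is also a valid YAML double-quoted scalar,
    # so this avoids hand-written escaping rules.
    return json.dumps(value)


def render(workflow: Workflow) -> str:
    """Render the workflow as GHA-style YAML text."""
    out = [f"name: {q(workflow.name)}", f"on: {workflow.on}", "jobs:"]
    for job in workflow.jobs:
        out.append(f"  {job.id}:")
        out.append("    runs-on: ubuntu-latest")
        out.append(f"    container: {q(job.image)}")
        if job.needs:
            out.append(f"    needs: [{', '.join(job.needs)}]")
        out.append("    steps:")
        out.append("      - uses: actions/checkout@v4")
        out.append("      - run: pixi install")
        out.append(f"      - run: {q(job.run)}")
    return "\n".join(out) + "\n"


example = Workflow(
    name="CI",
    on="pull_request",
    jobs=(
        Job(id="lint", run="pixi run lint"),
        Job(id="test", run="pixi run test", needs=("lint",)),
    ),
)
```

Calling `print(render(example))` emits a two-job workflow. Because the description is ordinary typed data, a type checker can catch mistakes such as a misspelled field before any YAML is generated.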
Some (unordered, tentative) goals for this DSL and code generator:
- Strongly typed inputs/outputs for every step of every job
- A better mechanism for code reuse
  - Must have the ability to inline code
- Automatic job sequencing
  - Given the inputs and outputs of each job, we know the dependencies between jobs, so we can schedule jobs to run in parallel if they don't depend on each other.
- Generate a variant of the same workflow for contributors
  - Sanitized and runs on `pull_request` with approval
  - Sanitized and runs on
- Local runner with the ability to execute entire workflows E2E
  - Every job uses a docker image + pixi, so this doesn't seem too far-fetched
  - It should also support `--dry-run` to display the execution plan
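The automatic sequencing and `--dry-run` goals can be sketched as follows: if each job declares named inputs and outputs, the dependency graph falls out of matching them up, and a topological sort yields stages whose jobs can run in parallel. The `Job` type, the artifact names, and the function names are hypothetical; `graphlib.TopologicalSorter` is from the Python standard library (3.9+).

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Job:
    name: str
    inputs: frozenset = frozenset()   # names of artifacts this job consumes
    outputs: frozenset = frozenset()  # names of artifacts this job produces


def infer_needs(jobs):
    """Map each job name to the set of job names it depends on."""
    producer = {}
    for job in jobs:
        for artifact in job.outputs:
            if artifact in producer:
                raise ValueError(f"{artifact!r} is produced by more than one job")
            producer[artifact] = job.name
    needs = {}
    for job in jobs:
        missing = sorted(a for a in job.inputs if a not in producer)
        if missing:
            raise ValueError(f"{job.name}: no job produces {missing}")
        needs[job.name] = {producer[a] for a in job.inputs}
    return needs


def execution_plan(jobs):
    """Group jobs into stages; jobs within one stage are independent."""
    sorter = TopologicalSorter(infer_needs(jobs))
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


def format_plan(stages) -> str:
    # What a `--dry-run` flag could print instead of executing anything.
    return "\n".join(
        f"stage {i}: {', '.join(names)}" for i, names in enumerate(stages, start=1)
    )


jobs = [
    Job("build_wheel", outputs=frozenset({"wheel"})),
    Job("build_docs", outputs=frozenset({"docs"})),
    Job("test", inputs=frozenset({"wheel"}), outputs=frozenset({"report"})),
    Job("publish", inputs=frozenset({"wheel", "docs", "report"})),
]
```

Here `execution_plan(jobs)` places `build_docs` and `build_wheel` in the first stage, `test` in the second, and `publish` in the third. The same inferred graph could feed the `needs:` entries of the generated YAML, so the local runner and GHA would agree on ordering.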
We don't have to meet all of the above goals. The only strict requirement is that the DSL is not YAML, and it's possible to author the files without deep knowledge of GHA.
We will likely continue to hand-author some workflows with very specific requirements, but this should be usable for all jobs that perform builds/tests/linting.