-
Notifications
You must be signed in to change notification settings - Fork 18
Use ./pants bootstrap-cache-key for more reliable setup caching #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The setup cache includes symlinks to things outside the cache, such as the (system) Python executable. Thus, the cache should only be reused on machines where those parts of the system exactly match up.
init-pants/action.yaml
Outdated
| PANTS_VERSION=$(grep -r '^pants_version\s*=' pants.toml) | ||
| echo "::set-output name=pants_version::$PANTS_VERSION" | ||
| PYTHON_VERSION=$(which python) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, pyenv will break this, as which will return a path to a shim, which can be different underlying interpreters at different times.
The only reliable thing to do is actually run the python interpreter and get it to report its version.
But a bigger issue here is that the version returned by which python may not be the one the pants script selects, I think. That script has its own interpreter selection logic. And folks can override that logic anyway.
So we may need to have a way for Pants itself to report the interpreter it's running on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, hm, I think there's potentially two dimensions of issues for a CI cache here:
- the symlinks to the chosen Python executable (installed somewhere on the system outside Pants' control) in
~/.cache/pants/setup/.../install/bin/ - the python version matching
The setup-python action for GHA hits both, since the system Python file path ends up containing the exact python version, and thus changes to the GHA runner config can lead to those symlinks breaking.
So we may need to have a way for Pants itself to report the interpreter it's running on.
Maybe I misunderstand what you're imagining, but I think this will be too late for the setup cache? As in, CI will need to be able to extract this info before bootstrapping pants, or else it'll need to bootstrap before it can determine the cache to use to save on the bootstrapping time...
Some possible options here (just me brainstorming):
- make sure this
init-pantsaction to works with the current behaviour of thesetup-pythonaction, without necessarily handling cases that might come up in other environments (e.g. pyenv/local dev), meaning thewhich-based approach of this PR as written may be okay - adjust to something that'll handle more cases of python being a shim
Suggested change
PYTHON_VERSION=$(which python) PYTHON_VERSION=$(python -c 'import sys; print(sys.executable)') - something even fancier, like having the pants shell script able to report this sort of info before bootstrapping (e.g.
./pants --bootstrap-cache-keyor something, that'd print some 'opaque' string likepants_version_2.13.0-python_path_/usr/...and whatever other info is required), which could package up the TOML file parsing too. I don't know how easy it is to propagate changes to that script though.
Since this init-pants action will only ever run in GitHub actions, 1 seems fine to me, but I'm more than happy to take guidance about other approaches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that whatever python is on the path here may not be the one the pants script selects.
So yeah, we may have to have a ./pants cmd that returns the full path to intepreter it selects. This would actually be useful in general, when debugging pants bootstrapping issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, your 3. seems like the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! I've proposed a possible step towards 3 in pantsbuild/setup#128. I'm imagining this action is then updated to do something like:
- name: Get Pants bootstrap cache key
id: pants_version
shell: bash
run: |
if ! grep 'PANTS_BOOTSTRAP_TOOLS' ./pants; then
echo "This action requires a newer ./pants script. <update instructions here...>"
exit 1
fi
PANTS_BOOTSTRAP_CACHE_KEY=$(PANTS_BOOTSTRAP_TOOLS=1 ./pants bootstrap-cache-key)
echo "::set-output name=pants_bootstrap_cache_key::$PANTS_BOOTSTRAP_CACHE_KEY"
- name: Cache Pants Setup
id: cache-pants-setup
uses: actions/cache@v2
with:
path: |
~/.cache/pants/setup
key: pants-setup-${{ steps.pants_version.outputs.pants_bootstrap_cache_key }}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made that update, and validated on our repo (seems to work, including the too-old pants script detection).
I've also added a drive-by fix to not run the base-commit look-up when it isn't going to be used, since it was breaking in our repo (due to a lack of gh script, I think).
This expands on #128 to add another bootstrap tool subcommand: `bootstrap-version`. This is designed to just be a simple number that's bumped whenever there's a feature scripts might need to query for. The version number is also automatically checked against the (numeric) value of the `PANTS_BOOTSTRAP_TOOLS` key. For instance, suppose the bootstrap tools already existed (at version 1), and then `bootstrap-cache-key` command was added, the version number would be bumped to 2, and then a consumer (such as pantsbuild/actions#6) could run something like: ```shell PANTS_BOOTSTRAP_CACHE_KEY=$(PANTS_BOOTSTRAP_TOOLS=2 ./pants bootstrap-cache-key) ```
Github updated the default minor version of python 3.9.x The cache key did not include the minor version which broke things for us. Luckily, this was already fixed in pantsbuild/actions#6. The fix requires an update to the `./pants` entrypoint/bootstrap script which was updated in pantsbuild/setup#128. updated `./pants` with: curl -L -O https://static.pantsbuild.org/setup/pants && chmod +x ./pants
update GHA pants-init action to fix build Github updated the default minor version of python 3.9.x The cache key did not include the minor version which broke things for us. Luckily, this was already fixed in pantsbuild/actions#6. The fix requires an update to the `./pants` entrypoint/bootstrap script which was updated in pantsbuild/setup#128. updated `./pants` with: curl -L -O https://static.pantsbuild.org/setup/pants && chmod +x ./pants
The setup cache includes symlinks to things outside the cache, such as the (system) Python executable. Thus, the cache should only be reused on machines where those parts of the system exactly match up. This switches from trying to compute an appropriate cache key externally to letting the pants bootstrap script do it itself, via the new
PANTS_BOOTSTRAP_TOOLS=1 ./pants bootstrap-cache-key(pantsbuild/setup#128).This is a breaking change, in that it requires updating the checked in pants scripts to support the new tool.
Fixes #5