-
Notifications
You must be signed in to change notification settings - Fork 3k
Transfer CI to GitHub Actions #4659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transfer CI to GitHub Actions #4659
Conversation
|
The documentation is not available anymore as the PR was closed or merged. |
| flake8 tests src benchmarks datasets metrics | ||
| test: | ||
| needs: check_code_quality |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check_code_quality job must complete successfully before the test jobs will run.
This saves CI running time (billing and environmental impact).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would still run the tests even if the code quality check doesn't succeed. This will avoid having to review a first time to tell the user to fix code quality, and then later review a second time to fix the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhoestq I'm not sure of understanding your point.
- First, contributors should run tests locally and fix them locally.
- If they forget, then the CI tells them there is a code quality issue, so that they can fix it by themselves, push and trigger the CI again (which once the code quality issue is fixed, will then run all the tests as well)
- Normally, maintainers should review only green PRs (unless pingged for help): it's the contributor responsibility to make it pass the CI
- And finally, I think (let me know if you do not think the same) a very common issue while contributing is running tests locally but forgetting running the code quality check (after multiple pushes): in these cases, it does not make sense to run all the tests (expensive) just because there is a non-used import, for example (that is not catch by
make style, BTW, but for the more expensivemake qualityonly); once the code quality issue is fixed, the new push will make all tests run anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the specific use case you mention, the reviewer should say to the contributor: you should run locally both the code quality checks and the tests, before pushing
| strategy: | ||
| matrix: | ||
| os: [ubuntu-latest, windows-latest] | ||
| pyarrow_version: [latest, 6.0.1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The strategy (cross product matrix of OS and PyArrow versions) defaults to True fail-fast: GitHub will cancel all in-progress and queued jobs in the matrix if any job in the matrix fails.
This saves CI running time (billing).
Also note here we use PyArrow 6.0.1 (instead of 6.0.0 used in CircleCI).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's nice if the linux test keep running when the windows test fails, because the challenge is always to make them work together at the same time, not one independently of the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the Windows tests fails, a fix is necessary.
And once fixed, the CI will be triggered again. Eventually ALL tests will be run.
The CI will only be green if and only if ALL test jobs run and pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhoestq
Another typical example, besides the one above about the contributor forgetting passing locally the code quality check:
- Usually, if there is a test not passing on Linux, it will not pass on Windows either (nearly for sure)
- Therefore, if the tests are not passing on ubunut-latest, what is the point of continue running the tests on windows-latest?
- Note Windows takes 3x more time to run than Linux and costs twice the price per minute than Linux
- Most of the times, the contributors will fix and push (triggering the CI again) even before the previous CI on Windows has finished
- Therefore, if the tests are not passing on ubunut-latest, what is the point of continue running the tests on windows-latest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense ! I think it's ok in the general case I agree. Let's try this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm actually it happens so often to have a CI that fails (because of flakiness or because of dataset card content missing) that I wouldn't stop all the jobs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not against trying this CI, but if it's annoying we'll have to disable the CI matrix strategy
It would be super annoying to have to relaunch the CI every time we get a 400/502 error from the Hub for example. And this happens often (just check the main branch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure, our golden target should be to have a CI so that we only merge a PR if and only if the CI is green. I'm strongly against having a CI that we can ignore sometimes and merge even if it is red.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well this is currently the case and having the matrix strategy is going to be annoying I think. I personnally don't want to lose time relaunching CI jobs just because of some random error from the Hub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently the case because our test suite is badly implemented:
- currently we do not differentiate between unit and integration tests: all are mixed up
- unit tests should be really "unit" and not depend on external services; they should be fast and these are the ones that should be triggered on each push (normally tenths of times for each PR)
- we should clearly differentiate integration tests, which could request external web services, among other things, fdifferently from unit tests.
- these should be triggered only once per PR, just before merging it
- these should be robust and implement retries in case of random errors given by the Hub
- we should only merge a PR when al the test suite is green, without exception: I insist it is a really bad habit merge PRs in red just because our test suite is not well implemented
Anyway, I am removing the fast-fail policy despite being completely against it, so that we can continue merging PRs in red. I'm commenting on this so that it is clearly stated.
| - uses: actions/checkout@v3 | ||
| with: | ||
| fetch-depth: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note fetch-depth: 0 means to fetch all history commits for all branches and tags (instead of default, just the merge commit of current branch to main branch).
The reason for this is to be able to collect the test tests/test_dataset_cards.py::test_changed_dataset_card, because these are parametrized with the function get_changed_datasets and its implementation requires both current branch and origin/main branch: origin/main...HEAD
diff_output = check_output(["git", "diff", "--name-only", "origin/main...HEAD"], cwd=repo_path)Otherwise, an error is raised:
_________________ ERROR collecting tests/test_dataset_cards.py _________________
tests/test_dataset_cards.py:53: in <module>
@pytest.mark.parametrize("dataset_name", get_changed_datasets(repo_path))
tests/test_dataset_cards.py:35: in get_changed_datasets
diff_output = check_output(["git", "diff", "--name-only", "origin/main...HEAD"], cwd=repo_path)
/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/subprocess.py:356: in check_output
**kwargs).stdout
/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/subprocess.py:438: in run
output=stdout, stderr=stderr)
E subprocess.CalledProcessError: Command '['git', 'diff', '--name-only', 'origin/main...HEAD']' returned non-zero exit status 128.
because of:
fatal: ambiguous argument 'origin/main...HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
| os.name == "nt" and (os.getenv("CIRCLECI") == "true" or os.getenv("GITHUB_ACTIONS") == "true"), | ||
| reason='On Windows CircleCI or GitHub Actions, it raises botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://127.0.0.1:5555/test"', | ||
| ) # TODO: find what's wrong with CircleCI / GitHub Actions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in CircleCI, this test must be skipped in GitHub Actions as well to avoid an error.
| os.name == "nt" and (os.getenv("CIRCLECI") == "true" or os.getenv("GITHUB_ACTIONS") == "true"), | ||
| reason='On Windows CircleCI or GitHub Actions, it raises botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://127.0.0.1:5555/test"', | ||
| ) # TODO: find what's wrong with CircleCI / GitHub Actions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
| if: ${{ matrix.os == 'ubuntu-latest' }} | ||
| run: | | ||
| sudo apt-get -y update | ||
| sudo apt-get -y install libsndfile1 sox |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous OS dependency libsndfile1-dev is replaced with libsndfile1.
The sox dependency is added (required by MP3 tests). This makes passing 2 more tests than before.
|
Thanks a lot @albertvillanova ! I hope we're finally done with flakiness on windows ^^ Also thanks for paying extra attention to billing and avoiding running unnecessary jobs. Though for certain aspects (see my comments), I think it's worth having the extra jobs to make our life easier |
|
I had missed it among all the other comments... |
|
@lhoestq, I'm specially enthusiastic with the fail-fast policy: it was in my TODO list for a long time. I really think it will have a positive impact (I would love to know the spent time saving it will enable, besides the carbon footprint reduction). 😉 So yes, as you said above, let's give it a try at least. If we encounter any inconvenience, we can easily disable it. Question: I guess I have to disable CircleCI CI before merging this PR? |
This PR transfers CI from CircleCI to GitHub Actions. The implementation in GitHub Actions tries to be as faithful as possible to the implementation in CircleCI and get the same output results (exceptions below).
IMPORTANT NOTE: The fast-fail policy (described below) is not finally implemented, so that:
See comments here: #4659 (comment)
Differences in the implementation in GitHub Actions compared to the CircleCI one:
This PR introduces some fail-fast mechanisms to significantly reduce the total time CI is running, both because of environmental impact and because CI in GitHub Actions billing depends on the minutes per month running time (see About billing for GitHub Actions):
check_code_qualityjob: only ifcheck_code_qualitypasses, then the other test jobs are launchedOS dependencies for Linux (see table below)
libsndfile1-devwithlibsndfile1: the same number of passing tests but less packages installedsox: required by MP3 tests (2 more tests are passed: 4788 instead of 4786)For tests using PyArrow 6, this PR uses 6.0.1 instead of 6.0.0
TO DO:
Close #4658.
Comparison between CircleCI and GitHub Actions