Skip to content

Conversation

@albertvillanova
Copy link
Member

@albertvillanova albertvillanova commented Jul 7, 2022

This PR transfers CI from CircleCI to GitHub Actions. The implementation in GitHub Actions tries to be as faithful as possible to the implementation in CircleCI and get the same output results (exceptions below).

IMPORTANT NOTE: The fast-fail policy (described below) is not finally implemented, so that:

  • we can continue merging PRs with CI in red because of some random error returned by the Hub
  • it is not annoying for maintainers to have to relaunch failed CI jobs

See comments here: #4659 (comment)

Differences in the implementation in GitHub Actions compared to the CircleCI one:

  • This PR introduces some fail-fast mechanisms to significantly reduce the total time CI is running, both because of environmental impact and because CI in GitHub Actions billing depends on the minutes per month running time (see About billing for GitHub Actions):

    • All tests depend on check_code_quality job: only if check_code_quality passes, then the other test jobs are launched
    • The tests are implemented with a matrix strategy (cross-product: OS and PyArrow versions) and fail-fast: if any of the 4 processes fails, the others are cancelled
  • OS dependencies for Linux (see table below)

    OS dependencies Passed tests Skipped tests
    libsndfile1-dev 4786 3119
    libsndfile1 4786 3119
    libsndfile1, sox 4788 3117
    • This PR replaces libsndfile1-dev with libsndfile1: the same number of passing tests but less packages installed
    • This PR adds sox: required by MP3 tests (2 more tests are passed: 4788 instead of 4786)
  • For tests using PyArrow 6, this PR uses 6.0.1 instead of 6.0.0

TO DO:

  • Remove old CircleCI CI: kept for the moment to compare stability and performance

Close #4658.

Comparison between CircleCI and GitHub Actions

CircleCI GitHub Actions
Ubuntu, pyarrow-latest
Passed tests 4786 4788
Duration 11m 0s 10m 10s
Windows, pyarrow-latest
Passed tests 4783 4783
Duration 29m 59s 22m 56s

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jul 7, 2022

The documentation is not available anymore as the PR was closed or merged.

@albertvillanova albertvillanova marked this pull request as ready for review July 7, 2022 16:59
flake8 tests src benchmarks datasets metrics
test:
needs: check_code_quality
Copy link
Member Author

@albertvillanova albertvillanova Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check_code_quality job must complete successfully before the test jobs will run.

This saves CI running time (billing and environmental impact).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would still run the tests even if the code quality check doesn't succeed. This will avoid having to review a first time to tell the user to fix code quality, and then later review a second time to fix the tests

Copy link
Member Author

@albertvillanova albertvillanova Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lhoestq I'm not sure of understanding your point.

  • First, contributors should run tests locally and fix them locally.
  • If they forget, then the CI tells them there is a code quality issue, so that they can fix it by themselves, push and trigger the CI again (which once the code quality issue is fixed, will then run all the tests as well)
  • Normally, maintainers should review only green PRs (unless pingged for help): it's the contributor responsibility to make it pass the CI
  • And finally, I think (let me know if you do not think the same) a very common issue while contributing is running tests locally but forgetting running the code quality check (after multiple pushes): in these cases, it does not make sense to run all the tests (expensive) just because there is a non-used import, for example (that is not catch by make style, BTW, but for the more expensive make quality only); once the code quality issue is fixed, the new push will make all tests run anyway

Copy link
Member Author

@albertvillanova albertvillanova Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the specific use case you mention, the reviewer should say to the contributor: you should run locally both the code quality checks and the tests, before pushing

Comment on lines 34 to 37
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
pyarrow_version: [latest, 6.0.1]
Copy link
Member Author

@albertvillanova albertvillanova Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strategy (cross product matrix of OS and PyArrow versions) defaults to True fail-fast: GitHub will cancel all in-progress and queued jobs in the matrix if any job in the matrix fails.

This saves CI running time (billing).

Also note here we use PyArrow 6.0.1 (instead of 6.0.0 used in CircleCI).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's nice if the linux test keep running when the windows test fails, because the challenge is always to make them work together at the same time, not one independently of the other.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Windows tests fails, a fix is necessary.

And once fixed, the CI will be triggered again. Eventually ALL tests will be run.

The CI will only be green if and only if ALL test jobs run and pass.

Copy link
Member Author

@albertvillanova albertvillanova Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lhoestq
Another typical example, besides the one above about the contributor forgetting passing locally the code quality check:

  • Usually, if there is a test not passing on Linux, it will not pass on Windows either (nearly for sure)
    • Therefore, if the tests are not passing on ubunut-latest, what is the point of continue running the tests on windows-latest?
      • Note Windows takes 3x more time to run than Linux and costs twice the price per minute than Linux
    • Most of the times, the contributors will fix and push (triggering the CI again) even before the previous CI on Windows has finished

Copy link
Member

@lhoestq lhoestq Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense ! I think it's ok in the general case I agree. Let's try this :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm actually it happens so often to have a CI that fails (because of flakiness or because of dataset card content missing) that I wouldn't stop all the jobs

Copy link
Member

@lhoestq lhoestq Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against trying this CI, but if it's annoying we'll have to disable the CI matrix strategy

It would be super annoying to have to relaunch the CI every time we get a 400/502 error from the Hub for example. And this happens often (just check the main branch)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure, our golden target should be to have a CI so that we only merge a PR if and only if the CI is green. I'm strongly against having a CI that we can ignore sometimes and merge even if it is red.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is currently the case and having the matrix strategy is going to be annoying I think. I personnally don't want to lose time relaunching CI jobs just because of some random error from the Hub.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently the case because our test suite is badly implemented:

  • currently we do not differentiate between unit and integration tests: all are mixed up
  • unit tests should be really "unit" and not depend on external services; they should be fast and these are the ones that should be triggered on each push (normally tenths of times for each PR)
  • we should clearly differentiate integration tests, which could request external web services, among other things, fdifferently from unit tests.
    • these should be triggered only once per PR, just before merging it
    • these should be robust and implement retries in case of random errors given by the Hub
    • we should only merge a PR when al the test suite is green, without exception: I insist it is a really bad habit merge PRs in red just because our test suite is not well implemented

Anyway, I am removing the fast-fail policy despite being completely against it, so that we can continue merging PRs in red. I'm commenting on this so that it is clearly stated.

Comment on lines +45 to +47
- uses: actions/checkout@v3
with:
fetch-depth: 0
Copy link
Member Author

@albertvillanova albertvillanova Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note fetch-depth: 0 means to fetch all history commits for all branches and tags (instead of default, just the merge commit of current branch to main branch).

The reason for this is to be able to collect the test tests/test_dataset_cards.py::test_changed_dataset_card, because these are parametrized with the function get_changed_datasets and its implementation requires both current branch and origin/main branch: origin/main...HEAD

diff_output = check_output(["git", "diff", "--name-only", "origin/main...HEAD"], cwd=repo_path)

Otherwise, an error is raised:

_________________ ERROR collecting tests/test_dataset_cards.py _________________
tests/test_dataset_cards.py:53: in <module>
    @pytest.mark.parametrize("dataset_name", get_changed_datasets(repo_path))
tests/test_dataset_cards.py:35: in get_changed_datasets
    diff_output = check_output(["git", "diff", "--name-only", "origin/main...HEAD"], cwd=repo_path)
/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/subprocess.py:356: in check_output
    **kwargs).stdout
/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/subprocess.py:438: in run
    output=stdout, stderr=stderr)
E   subprocess.CalledProcessError: Command '['git', 'diff', '--name-only', 'origin/main...HEAD']' returned non-zero exit status 128.

because of:

fatal: ambiguous argument 'origin/main...HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

Comment on lines +3122 to +3124
os.name == "nt" and (os.getenv("CIRCLECI") == "true" or os.getenv("GITHUB_ACTIONS") == "true"),
reason='On Windows CircleCI or GitHub Actions, it raises botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://127.0.0.1:5555/test"',
) # TODO: find what's wrong with CircleCI / GitHub Actions
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in CircleCI, this test must be skipped in GitHub Actions as well to avoid an error.

Comment on lines +668 to +670
os.name == "nt" and (os.getenv("CIRCLECI") == "true" or os.getenv("GITHUB_ACTIONS") == "true"),
reason='On Windows CircleCI or GitHub Actions, it raises botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://127.0.0.1:5555/test"',
) # TODO: find what's wrong with CircleCI / GitHub Actions
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@albertvillanova albertvillanova requested a review from lhoestq July 8, 2022 09:12
if: ${{ matrix.os == 'ubuntu-latest' }}
run: |
sudo apt-get -y update
sudo apt-get -y install libsndfile1 sox
Copy link
Member Author

@albertvillanova albertvillanova Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous OS dependency libsndfile1-dev is replaced with libsndfile1.
The sox dependency is added (required by MP3 tests). This makes passing 2 more tests than before.

@lhoestq
Copy link
Member

lhoestq commented Jul 8, 2022

Thanks a lot @albertvillanova ! I hope we're finally done with flakiness on windows ^^

Also thanks for paying extra attention to billing and avoiding running unnecessary jobs. Though for certain aspects (see my comments), I think it's worth having the extra jobs to make our life easier

@albertvillanova
Copy link
Member Author

albertvillanova commented Jul 8, 2022

@lhoestq I think you forgot to add your comments?

I had missed it among all the other comments...

@albertvillanova
Copy link
Member Author

albertvillanova commented Jul 8, 2022

@lhoestq, I'm specially enthusiastic with the fail-fast policy: it was in my TODO list for a long time. I really think it will have a positive impact (I would love to know the spent time saving it will enable, besides the carbon footprint reduction). 😉

So yes, as you said above, let's give it a try at least. If we encounter any inconvenience, we can easily disable it.

Question: I guess I have to disable CircleCI CI before merging this PR?

@albertvillanova albertvillanova changed the title Transfer CI tests to GH Actions Transfer CI to GH Actions Jul 12, 2022
@albertvillanova albertvillanova changed the title Transfer CI to GH Actions Transfer CI to GitHub Actions Jul 12, 2022
@albertvillanova albertvillanova merged commit f3b6697 into huggingface:main Jul 12, 2022
@albertvillanova albertvillanova deleted the ci-gh-actions branch July 12, 2022 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Transfer CI tests to GitHub Actions

3 participants