fix(tests): stabilize flaky Hub LFS integration test #7889
Problem
`test_push_dataset_dict_to_hub_overwrite_files` intermittently fails. This has been causing the `deps-latest` integration tests to fail on main (visible in recent CI runs). I ran into this while working on the BIDS loader PR and dug into the root cause.
Root Cause
Two race conditions in the test:
- `push_to_hub` calls don't wait for Hub to fully propagate LFS objects between pushes
Solution
- `_wait_for_repo_ready()` helper that polls `list_repo_files` to ensure the repo is consistent before subsequent operations
- A separate name (`ds_name_2`) for the second scenario, eliminating the delete/create race entirely
Testing
All 4 integration test variants now pass:
- ubuntu-latest, deps-latest (was failing)
- ubuntu-latest, deps-minimum
- windows-latest, deps-latest (was failing)
- windows-latest, deps-minimum
Validated on fork: The-Obstacle-Is-The-Way#4
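For reviewers who want a picture of the polling approach described under Solution, here is a minimal sketch. It is not the helper from this PR: the name `wait_for_repo_ready`, its parameters (`list_files`, `expected_files`, `max_attempts`, `delay_seconds`, `sleep`), and the retry policy are all assumptions made for illustration. The listing call is passed in as a callable so the sketch runs without network access.

```python
import time
from typing import Callable, Iterable, Optional, Sequence


def wait_for_repo_ready(
    list_files: Callable[[], Iterable[str]],
    expected_files: Optional[Sequence[str]] = None,
    max_attempts: int = 10,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll list_files() until it succeeds and contains every expected file.

    Hypothetical sketch: returns True once the listing looks consistent,
    False if all attempts are used up.
    """
    expected = set(expected_files or ())
    for attempt in range(max_attempts):
        try:
            files = set(list_files())
        except Exception:
            # Assumption: any error means "not ready yet". A real helper
            # would probably catch only Hub HTTP errors.
            files = None
        if files is not None and expected <= files:
            return True
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    return False
```

In a test, this could be called between pushes as `wait_for_repo_ready(lambda: api.list_repo_files(repo_id, repo_type="dataset"))`, where `api` is a `huggingface_hub.HfApi` instance and `repo_id` is a placeholder for the repo under test. Injecting `sleep` keeps the helper easy to unit-test without real delays.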
Related
- `push_to_hub` is not concurrency safe (dataset schema corruption) #7600 (push_to_hub concurrency)
- `push_to_hub` is not robust to hub closing connection #6392 (push_to_hub connection robustness)
cc @lhoestq - small fix but should help CI reliability