@The-Obstacle-Is-The-Way (Owner) commented Nov 29, 2025

Summary

  • Add _wait_for_repo_ready helper to ensure Hub repository consistency between operations
  • Fix race condition in test_push_dataset_dict_to_hub_overwrite_files that caused intermittent LFS pointer errors

Problem

test_push_dataset_dict_to_hub_overwrite_files intermittently fails with:

BadRequestError: LFS pointer pointed to a file that does not exist

This occurs because rapid successive push_to_hub calls don't wait for Hub's LFS object propagation between pushes.

Solution

Add a synchronization delay between push operations (via the _wait_for_repo_ready helper) so that the Hub has fully processed LFS objects before the next push.

Test plan

  • Ran test locally - passes consistently
  • CI passes on this PR
  • Verify fix works on both Ubuntu and Windows

Related Issues


This fix is intended for upstream submission to huggingface/datasets after validation.

Summary by CodeRabbit

  • Tests
    • Improved test reliability by adding explicit, time-bound readiness checks and retry logic to ensure repository artifacts (including large file objects) finish propagating after pushes and uploads.
    • Integrated these readiness waits into test flows after pushes and before dependent actions, replacing fragile timing-based sequencing and reducing race conditions and flakiness.



coderabbitai bot commented Nov 29, 2025


📥 Commits

Reviewing files that changed from the base of the PR and between e98efc8 and 109720e.

📒 Files selected for processing (1)
  • tests/test_upstream_hub.py (6 hunks)

Walkthrough

Adds a _wait_for_repo_ready(self, repo_id, max_wait=30) helper to TestPushToHub that polls list_repo_files, handles transient HfHubHTTPErrors using time.monotonic(), and raises TimeoutError on timeout. Calls to this helper replace timing-based sequencing after pushes to ensure Hub/LFS propagation before further actions.
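
The polling pattern described in this walkthrough can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the PR's actual helper: `wait_until_ready`, its `probe`, `clock`, and `sleep` parameters, and the `TransientHubError` class are hypothetical names, with `TransientHubError` standing in for `HfHubHTTPError` so the sketch runs without network access.

```python
import time
from typing import Callable, Tuple, Type


class TransientHubError(Exception):
    """Hypothetical stand-in for huggingface_hub's HfHubHTTPError."""


def wait_until_ready(
    probe: Callable[[], object],
    max_wait: float = 30.0,
    interval: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (TransientHubError,),
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call `probe` until it succeeds or `max_wait` seconds have elapsed."""
    # Compute the deadline once from a monotonic clock so that wall-clock
    # adjustments cannot shorten or lengthen the wait.
    deadline = clock() + max_wait
    while True:
        try:
            # In a real test this might be something like:
            # lambda: api.list_repo_files(repo_id, repo_type="dataset")
            probe()
            return
        except retry_on:
            if clock() >= deadline:
                raise TimeoutError(f"repository not ready after {max_wait}s")
            sleep(interval)
```

Injecting `clock` and `sleep` is a design choice made only so the loop can be exercised without real waiting. The review comments also mention an extra one-second sleep after a successful probe, which this sketch leaves out.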

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Test infrastructure: repository readiness polling (tests/test_upstream_hub.py) | Added _wait_for_repo_ready(self, repo_id, max_wait=30) to TestPushToHub that retries list_repo_files, catches HfHubHTTPError, uses time.monotonic() for timeout tracking, and raises TimeoutError on timeout. Inserted calls to this helper after push/upload operations and before dependent actions to synchronize Hub/LFS propagation, replacing previous timing-based sequencing. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review retry loop timing and any sleep/backoff behavior.
  • Verify correct import/handling and scoping of HfHubHTTPError.
  • Confirm helper call placements across test scenarios don't mask genuine race conditions or introduce undue test slowdowns.

Poem

🐰 I hop and poll the hub so bright,
Counting seconds by monotonic light,
Waiting for LFS and files to land,
I nudge the tests with a patient hand,
Now pushes settle — all tests take flight. ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(tests): stabilize flaky Hub LFS integration test' clearly describes the main change: fixing flaky Hub LFS integration tests through stabilization efforts. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/test_upstream_hub.py (3)

270-281: Polling helper for Hub consistency looks solid; only minor nits possible

The retry loop around list_repo_files with a bounded wait and explicit TimeoutError is a good fit to make the test deterministic and fail loudly if the Hub is slow or inconsistent. Catching Exception is acceptable in this test-only helper, and the extra sleep(1) after a successful probe gives LFS a bit more buffer.

If you ever need finer control, a small optional refinement would be:

  • Track elapsed time via time.monotonic() so max_wait is strictly “seconds” rather than “iterations”.
  • Optionally narrow the except to hub-related errors to avoid masking unexpected issues, or at least log them.

Not blockers; current implementation is fine for the intended use.
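
To illustrate the first refinement, here is a small simulation of why an iteration-bounded loop does not bound wall-clock time when the probe itself is slow. The function names and the fixed `probe_cost` are assumptions made for the example. No real clock or Hub call is involved.

```python
def elapsed_iteration_bound(max_wait: int, probe_cost: float, sleep: float = 1.0) -> float:
    """Simulated seconds spent when `max_wait` counts loop iterations."""
    elapsed = 0.0
    for _ in range(max_wait):
        # Each round pays for the failing probe plus the sleep between attempts.
        elapsed += probe_cost + sleep
    return elapsed


def elapsed_deadline_bound(max_wait: float, probe_cost: float, sleep: float = 1.0) -> float:
    """Simulated seconds spent when `max_wait` is a deadline in seconds."""
    elapsed = 0.0
    while elapsed < max_wait:
        elapsed += probe_cost + sleep
    return elapsed


# A probe that takes 2 s to fail, with max_wait=30:
print(elapsed_iteration_bound(30, probe_cost=2.0))  # 90.0
print(elapsed_deadline_bound(30, probe_cost=2.0))   # 30.0
```

In the simulation, the iteration-bounded loop runs three times longer than the nominal 30 seconds, while the deadline-bounded loop stops at the first check past the deadline.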


294-313: Well-placed waits between first push, manual upload, and second push

Adding _wait_for_repo_ready(ds_name) after the initial push_to_hub and again after the upload_file call is a reasonable way to avoid the LFS propagation race that was causing the flaky BadRequestError. The helper is scoped to this test class and only used here, so the additional latency is limited to this one integration test.

If further flakes appear, a non-blocking follow-up could be to:

  • Also use _wait_for_repo_ready after the second push_to_hub before asserting on list_repo_files, to guard against any remaining backend lag.
    But given the current problem description, the added waits should already address the primary race.

345-364: Symmetric guarding of the second overwrite scenario looks appropriate

Mirroring the same wait pattern in the second half of test_push_dataset_dict_to_hub_overwrite_files (push → wait → upload → wait → push) keeps both branches consistent and should prevent the same LFS race when the second run changes the shard count in the opposite direction.

No issues from a correctness standpoint. As with the first block, you might consider adding a final readiness wait before the concluding load_dataset if you ever see residual flakiness, but it's not strictly necessary right now.
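
The push → wait → upload → wait → push ordering can be illustrated with a toy model. Everything here is hypothetical: `FakeHub` is an in-memory stand-in that merely mimics "a write fails if the previous write has not settled". It does not reflect how the Hub or LFS actually work internally.

```python
class FakeHub:
    """Toy in-memory model, NOT the real Hub: writes settle after `lag` polls."""

    def __init__(self, lag: int = 2) -> None:
        self.lag = lag
        self.unsettled = 0  # polls remaining until the last write is visible
        self.log = []

    def write(self, name: str) -> None:
        # Mimic the reported failure: writing while a previous write is unsettled.
        if self.unsettled:
            raise RuntimeError("LFS pointer pointed to a file that does not exist")
        self.log.append(name)
        self.unsettled = self.lag

    def poll(self) -> bool:
        """Return True once the last write has settled."""
        if self.unsettled:
            self.unsettled -= 1
        return self.unsettled == 0


def wait_for_ready(hub: FakeHub, max_polls: int = 10) -> None:
    """Poll the toy hub until it reports that the last write has settled."""
    for _ in range(max_polls):
        if hub.poll():
            return
    raise TimeoutError("hub did not settle")


def run_sequence(hub: FakeHub, wait: bool) -> list:
    """Perform push, upload, push; optionally wait for readiness after each step."""
    for step in ("push_1", "upload_file", "push_2"):
        hub.write(step)
        if wait:
            wait_for_ready(hub)
    return hub.log
```

With `wait=True` all three steps complete. With `wait=False` the second write raises, which is the shape of the flake described in the PR. The sketch also waits after the final push, matching the optional follow-up suggested above.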

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 004a5bf and b6490dc.

📒 Files selected for processing (1)
  • tests/test_upstream_hub.py (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: test (unit, windows-latest, deps-minimum)
  • GitHub Check: test (integration, windows-latest, deps-latest)
  • GitHub Check: test (unit, ubuntu-latest, deps-minimum)
  • GitHub Check: test (integration, windows-latest, deps-minimum)
  • GitHub Check: test (unit, windows-latest, deps-latest)
  • GitHub Check: test (integration, ubuntu-latest, deps-minimum)
  • GitHub Check: test_py314_future (unit, windows-latest, deps-latest)
  • GitHub Check: test (unit, ubuntu-latest, deps-latest)
  • GitHub Check: test (integration, ubuntu-latest, deps-latest)
  • GitHub Check: test_py314 (unit, ubuntu-latest, deps-latest)
  • GitHub Check: test_py314_future (unit, ubuntu-latest, deps-latest)
  • GitHub Check: test_py314 (unit, windows-latest, deps-latest)
  • GitHub Check: build / build_pr_documentation


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/test_upstream_hub.py (1)

270-296: Reasonable approach to stabilize flaky Hub tests.

The helper addresses the race condition with a polling + sleep strategy. A few observations:

  1. Exception specificity: Catching all HfHubHTTPError is broad. Consider logging transient errors for debugging intermittent CI failures.

  2. Import placement: Moving the import to the top of the file (alongside other huggingface_hub imports on line 15) would be more idiomatic.

  3. Delay tuning: The 1-second post-success sleep is a heuristic. If flakiness persists, this may need adjustment or a more robust verification (e.g., retrying the subsequent operation with backoff rather than pre-waiting).
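
The alternative mentioned in point 3, retrying the dependent operation with backoff instead of pre-waiting, might look roughly like this. It is a sketch under assumptions: `retry_with_backoff` and its parameters are hypothetical names, and it also logs each transient failure as point 1 suggests.

```python
import logging
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def retry_with_backoff(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on `retry_on` with delays of 1, 2, 4, ... seconds."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on as err:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2**attempt
            # Log the transient failure so intermittent CI issues leave a trace.
            logger.warning("attempt %d failed (%s); retrying in %.1fs", attempt + 1, err, delay)
            sleep(delay)
    raise ValueError("attempts must be at least 1")
```

In the test, `operation` could wrap the second `push_to_hub` call, with `retry_on=(HfHubHTTPError,)`. The trade-off is that a retry only costs time when the operation actually fails, whereas a pre-wait is paid on every run.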

Consider moving the import to the file header:

-from huggingface_hub import DatasetCard, HfApi
+from huggingface_hub import DatasetCard, HfApi
+from huggingface_hub.errors import HfHubHTTPError

Then remove line 284 from within the method.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6490dc and fab2e85.

📒 Files selected for processing (1)
  • tests/test_upstream_hub.py (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: test (integration, ubuntu-latest, deps-minimum)
  • GitHub Check: test (unit, windows-latest, deps-latest)
  • GitHub Check: test (integration, windows-latest, deps-minimum)
  • GitHub Check: test (integration, windows-latest, deps-latest)
  • GitHub Check: test (unit, windows-latest, deps-minimum)
  • GitHub Check: test (integration, ubuntu-latest, deps-latest)
  • GitHub Check: test (unit, ubuntu-latest, deps-minimum)
  • GitHub Check: test (unit, ubuntu-latest, deps-latest)
  • GitHub Check: test_py314_future (unit, ubuntu-latest, deps-latest)
  • GitHub Check: test_py314_future (unit, windows-latest, deps-latest)
  • GitHub Check: test_py314 (unit, windows-latest, deps-latest)
  • GitHub Check: test_py314 (unit, ubuntu-latest, deps-latest)
  • GitHub Check: build / build_pr_documentation
🔇 Additional comments (4)
tests/test_upstream_hub.py (4)

309-311: LGTM!

Appropriate placement after the initial push to ensure Hub consistency before the subsequent upload operation.


326-328: LGTM!

Correctly placed between the upload and the subsequent push operation.


360-362: LGTM!

Mirrors the synchronization pattern from the first test block.


377-379: LGTM!

Completes the synchronization pattern for the second test scenario.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/test_upstream_hub.py (1)

270-296: Consider broadening exception handling to cover more transient network errors.

The helper only catches HfHubHTTPError, but transient connection issues (e.g., ConnectionError, requests.exceptions.Timeout) would propagate and fail the test. Consider catching a broader set of transient errors:

-            except HfHubHTTPError:
+            except (HfHubHTTPError, ConnectionError) as e:
+                # Retry on transient errors
                 time.sleep(1)

Also, moving the import to the top of the file with other huggingface_hub imports would be more idiomatic:

-from huggingface_hub import DatasetCard, HfApi
+from huggingface_hub import DatasetCard, HfApi
+from huggingface_hub.errors import HfHubHTTPError

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c23371 and e98efc8.

📒 Files selected for processing (1)
  • tests/test_upstream_hub.py (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_upstream_hub.py (1)
tests/fixtures/hub.py (1)
  • temporary_repo (87-99)
🪛 Gitleaks (8.29.1)
tests/test_upstream_hub.py

[high] 358-358: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: test (unit, windows-latest, deps-latest)
  • GitHub Check: test (integration, ubuntu-latest, deps-minimum)
  • GitHub Check: test (integration, windows-latest, deps-latest)
  • GitHub Check: test (integration, ubuntu-latest, deps-latest)
  • GitHub Check: test (unit, windows-latest, deps-minimum)
  • GitHub Check: test (integration, windows-latest, deps-minimum)
  • GitHub Check: test (unit, ubuntu-latest, deps-minimum)
  • GitHub Check: test (unit, ubuntu-latest, deps-latest)
  • GitHub Check: test_py314_future (unit, windows-latest, deps-latest)
  • GitHub Check: test_py314_future (unit, ubuntu-latest, deps-latest)
  • GitHub Check: test_py314 (unit, windows-latest, deps-latest)
  • GitHub Check: test_py314 (unit, ubuntu-latest, deps-latest)
  • GitHub Check: build / build_pr_documentation
🔇 Additional comments (4)
tests/test_upstream_hub.py (4)

306-311: LGTM!

Correctly placed synchronization point after the first push ensures Hub has processed LFS objects before the subsequent file upload.


326-329: LGTM!

Synchronization before the second push ensures the manually uploaded file is fully propagated.


357-361: LGTM - variable rename and synchronization points are correct.

The rename to ds_name_2 for the second test block improves clarity. The static analysis warning about "Generic API Key" on line 358 is a false positive—self._token references the test fixture constant CI_HUB_USER_TOKEN, not a hardcoded secret.


369-395: LGTM!

All API calls in the second test block consistently reference ds_name_2, and the wait synchronization is correctly placed before the second push operation.

`test_push_dataset_dict_to_hub_overwrite_files` intermittently fails with:
```
BadRequestError: LFS pointer pointed to a file that does not exist
```

Root cause: Two race conditions in the test design:
1. Rapid successive `push_to_hub` calls don't wait for Hub's LFS object
   propagation between pushes
2. Second test scenario reused the same repo name, creating a race between
   repo deletion and recreation

Fix:
- Add `_wait_for_repo_ready()` helper that ensures Hub repository is in a
  consistent state before subsequent operations
- Use unique repo name (`ds_name_2`) for second scenario to eliminate the
  delete/create race entirely

Tested: All 4 integration test variants now pass consistently (ubuntu/windows,
deps-latest/deps-minimum).
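
The second part of the fix, giving each scenario its own repository name so that no delete/recreate cycle is needed, could be sketched as follows. The `unique_repo_id` helper and its naming scheme are hypothetical; per the review context, the actual test appears to use the `temporary_repo` fixture from tests/fixtures/hub.py.

```python
import uuid


def unique_repo_id(user: str, prefix: str = "test") -> str:
    """Return `user/prefix-<12 hex chars>`; the naming scheme is illustrative."""
    # A fresh random suffix per call means two scenarios never share a repo,
    # so the second one cannot race with deletion of the first.
    return f"{user}/{prefix}-{uuid.uuid4().hex[:12]}"


ds_name = unique_repo_id("ci-user")
ds_name_2 = unique_repo_id("ci-user")  # independent of ds_name
```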