Remove reliance on remote datasets in tests#7637
Merged
rapids-bot[bot] merged 12 commits intorapidsai:mainfrom Dec 31, 2025
It is deprecated and should have been replaced anyway. In general, we do not want to rely on remote datasets. Closes rapidsai#5158
The generator functions mimic the originally used datasets, including the California housing dataset and the 20 newsgroups dataset. In addition, we update some precision assertions to work with the new datasets.
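As a rough illustration of what such a generator can look like, here is a minimal sketch that produces a regression dataset with the same shape as California housing (20640 samples, 8 features). It is not the code from this PR: the function name `make_housing_like` and all of its parameters are hypothetical, and plain NumPy is used instead of a scikit-learn or cuML generator to keep the example dependency-light.

```python
import numpy as np


def make_housing_like(n_samples=20640, n_features=8, noise=0.5, seed=0):
    """Return (X, y) shaped like California housing, built locally.

    Hypothetical helper for illustration only; it draws Gaussian features
    and a noisy linear target, so no network access is needed.
    """
    rng = np.random.default_rng(seed)
    # Features: standard-normal values in float32, a common dtype for GPU tests.
    X = rng.standard_normal((n_samples, n_features)).astype(np.float32)
    # Target: fixed random linear combination of the features plus noise.
    coef = rng.uniform(-1.0, 1.0, size=n_features).astype(np.float32)
    y = X @ coef + noise * rng.standard_normal(n_samples).astype(np.float32)
    return X, y.astype(np.float32)


if __name__ == "__main__":
    X, y = make_housing_like()
    print(X.shape, y.shape)
```

Because the data distribution differs from the real dataset, score thresholds tuned against the fetched data generally need to be revisited, which is consistent with the precision assertion updates mentioned above.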
d91469e to bf00358
bdice approved these changes on Dec 31, 2025
Similar to the nlp_20news fixture function.
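For context, a synthetic stand-in for a text dataset such as 20 newsgroups could be generated along the following lines. This is only a sketch under assumptions: `make_text_corpus`, its parameters, and the `word<i>` vocabulary are hypothetical and are not taken from the cuML test suite. Each class favors its own slice of the vocabulary so that a classifier has some signal to learn.

```python
import random


def make_text_corpus(n_docs=100, n_classes=4, vocab_size=200, doc_len=50, seed=0):
    """Return (docs, labels) for a small, seeded, synthetic text corpus.

    Hypothetical helper for illustration only.
    """
    if vocab_size < 2 * n_classes:
        raise ValueError("vocab_size must be at least 2 * n_classes")
    rng = random.Random(seed)
    vocab = [f"word{i}" for i in range(vocab_size)]
    # Reserve a distinct block of "topic" words for every class.
    topic_size = vocab_size // (2 * n_classes)
    docs, labels = [], []
    for i in range(n_docs):
        label = i % n_classes  # round-robin keeps the classes balanced
        topic = vocab[label * topic_size:(label + 1) * topic_size]
        # Roughly half the words come from the class topic, the rest from
        # the whole vocabulary, which acts as background noise.
        words = [
            rng.choice(topic) if rng.random() < 0.5 else rng.choice(vocab)
            for _ in range(doc_len)
        ]
        docs.append(" ".join(words))
        labels.append(label)
    return docs, labels


if __name__ == "__main__":
    docs, labels = make_text_corpus(n_docs=8)
    print(labels)
    print(docs[0][:60])
```

A fixture can then stay a thin wrapper that simply calls such a generator with fixed arguments.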
I am addressing the remaining dask test failures.
The fixture functions remain as thin wrappers.
It was a very thin wrapper around the make_regression synthetic dataset generation function and was used in only one place, a test that failed and carried an xfail marker.
To better reflect the actual fixture content and avoid misconceptions.
/merge
rapids-bot[bot] pushed a commit that referenced this pull request on Jan 2, 2026
To not rely on remote datasets. Closes #7643. Follow-up to #7637.
Authors:
- Simon Adorf (https://github.com/csadorf)
Approvers:
- James Lamb (https://github.com/jameslamb)
URL: #7644
rapids-bot[bot] pushed a commit that referenced this pull request on Jan 5, 2026
#7637 removed the last uses of `tenacity` here. This PR removes that dependency from test environments, including the `[test]` extra for `cuml` wheels.
Authors:
- James Lamb (https://github.com/jameslamb)
Approvers:
- Simon Adorf (https://github.com/csadorf)
- Bradley Dice (https://github.com/bdice)
URL: #7645
mani-builds pushed a commit to mani-builds/cuml that referenced this pull request on Jan 11, 2026
Replace fetched datasets with synthetic generated data to make tests more robust and eliminate network dependencies.
Closes rapidsai#3161; Closes rapidsai#5158; Closes rapidsai#6558; Closes rapidsai#7639
Authors:
- Simon Adorf (https://github.com/csadorf)
Approvers:
- Bradley Dice (https://github.com/bdice)
URL: rapidsai#7637
Replace fetched datasets with synthetic generated data to make tests more robust and eliminate network dependencies.
Closes #3161; Closes #5158; Closes #6558; Closes #7639