Parallel execution of scikit-learn integration tests broken due to non-deterministic test collection

With the merge of https://github.com/rapidsai/cuml/pull/6866, we introduced a patch to sklearn's `all_estimators` to prioritize proxy estimators.

Unfortunately, this change also introduced a source of non-determinism in the test collection order, which means that executing scikit-learn integration tests in parallel is broken.

While we don't run tests in parallel in CI, developers cannot reliably run them in parallel for faster feedback.

Example:

```console
./python/cuml/cuml_accel_tests/upstream/scikit-learn/run-tests.sh -n 4 -v
```

This will likely fail with an error like `Different tests were collected between gw0 and gw2` due to non-deterministic test ordering.

## Expected Behavior

Scikit-learn integration tests should execute deterministically in parallel without ordering issues.

## Current Behavior

The order in which tests are collected is currently non-deterministic, which causes failures when running the test suite in parallel. In particular, the sequence of parameterized test instances for a given estimator can change from one run to another. For example, tests involving `LinearSVC(max_iter=20)` and `LinearSVC()` may be collected in a different order on each run. The underlying cause is not fully understood, but it appears to be connected to the way the `all_estimators` function is patched, possibly because the patch is not applied consistently or completely.

## Expected Error Output

When running the command above, you'll see an error like:

```
ERROR collecting gw2
Different tests were collected between gw0 and gw2. The difference is:
--- gw0
+++ gw2
@@ -25704,6 +25704,70 @@
 tests/test_common.py::test_estimators[LinearRegression()-check_fit1d]
 tests/test_common.py::test_estimators[LinearRegression()-check_fit2d_predict1d]
 tests/test_common.py::test_estimators[LinearRegression()-check_requires_y_none]
+tests/test_common.py::test_estimators[LinearSVC()-check_estimator_cloneable0]
+tests/test_common.py::test_estimators[LinearSVC()-check_estimator_cloneable1]
+...
+tests/test_common.py::test_estimators[LinearSVC()-check_requires_y_none]
 tests/test_common.py::test_estimators[LinearSVC(max_iter=20)-check_estimator_cloneable0]
 tests/test_common.py::test_estimators[LinearSVC(max_iter=20)-check_estimator_cloneable1]
 ...
 tests/test_common.py::test_estimators[LinearSVC(max_iter=20)-check_requires_y_none]
-tests/test_common.py::test_estimators[LinearSVC()-check_estimator_cloneable0]
-tests/test_common.py::test_estimators[LinearSVC()-check_estimator_cloneable1]
-...
-tests/test_common.py::test_estimators[LinearSVC()-check_requires_y_none]
 tests/test_common.py::test_estimators[LinearSVR()-check_estimator_cloneable0]
```

The key issue is that `LinearSVC()` and `LinearSVC(max_iter=20)` appear in different orders between test workers, causing pytest-xdist to fail with "Different tests were collected" errors.
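To confirm that two collections contain the same tests and differ only in order, a small helper along these lines may be useful. It is a minimal sketch: the function names are hypothetical, and it assumes the test IDs have already been gathered into lists, for example from the output of two separate `pytest --collect-only -q` runs.

```python
import difflib


def collection_diff(ids_a, ids_b, label_a="run-a", label_b="run-b"):
    """Return a unified diff of two lists of collected test IDs.

    The result is an empty list when both collections are identical.
    """
    return list(
        difflib.unified_diff(
            ids_a, ids_b, fromfile=label_a, tofile=label_b, lineterm="", n=1
        )
    )


def same_tests_different_order(ids_a, ids_b):
    """True when both runs collected the same tests but in a different order."""
    return ids_a != ids_b and sorted(ids_a) == sorted(ids_b)
```

If `same_tests_different_order` returns `True`, the failure is purely an ordering problem rather than a missing or extra test.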

## Proposed Solutions

1. **Investigate the patch**: Review the changes in PR #6866 to understand what causes the non-determinism
2. **Fix ordering logic**: Ensure the proxy estimator prioritization maintains deterministic ordering
3. **Improve monkeypatching reliability**: The current patch may not be applied reliably, leading to inconsistent behavior
4. **Consider alternative approaches**: Instead of monkeypatching, consider other ways to handle duplicate estimator discovery
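As an illustration of option 2, the following is a minimal sketch of a prioritization step whose output does not depend on discovery order. It assumes the input is a list of `(name, class)` pairs that may contain duplicate names, and that a predicate `is_proxy` (hypothetical here) identifies proxy estimators. It is not the logic from PR #6866, and since the root cause is not yet understood it may not address the parameterized-instance ordering seen above.

```python
def prioritize_deterministically(estimators, is_proxy):
    """Deduplicate (name, cls) pairs, preferring proxies, in a stable order.

    `is_proxy` is a hypothetical predicate supplied by the caller.
    Ties are broken by module and qualified name, so the result is the
    same regardless of the order in which the pairs were discovered.
    """

    def rank(cls):
        # Proxies sort first (False < True), then by a stable identity.
        return (not is_proxy(cls), cls.__module__, cls.__qualname__)

    best = {}
    for name, cls in estimators:
        if name not in best or rank(cls) < rank(best[name]):
            best[name] = cls

    # Sort by name so every worker sees the same sequence.
    return sorted(best.items(), key=lambda item: item[0])
```

Sorting by name on the way out means each pytest-xdist worker would receive the same sequence even if the underlying discovery iterates in a different order per process.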

