With the merge of #6866, we introduced a patch to sklearn's all_estimators to prioritize proxy estimators.
Unfortunately, this change also introduced a source of non-determinism in the test collection order, which means that executing scikit-learn integration tests in parallel is broken.
While we don't run tests in parallel in CI, developers cannot reliably run tests in parallel for faster feedback.
Running the scikit-learn integration tests across multiple pytest-xdist workers (for example, with -n 4) will likely fail with an error like "Different tests were collected between gw0 and gw2" due to non-deterministic test ordering.
Expected Behavior
Scikit-learn integration tests should execute deterministically in parallel without ordering issues.
Current Behavior
The order in which tests are collected is currently non-deterministic, which leads to failures when attempting to run the test suite in parallel. In particular, the sequence of parameterized test instances for a given estimator can change from one run to another. For example, tests involving LinearSVC(max_iter=20) and LinearSVC() may be collected in different orders depending on the run. The underlying cause of this issue is not fully understood, but it appears to be connected to the way the all_estimators function is patched—possibly due to the patch not being applied consistently or completely.
Expected Error Output
Running the tests in parallel produces an error like:
ERROR collecting gw2
Different tests were collected between gw0 and gw2. The difference is:
--- gw0
+++ gw2
@@ -25704,6 +25704,70 @@
tests/test_common.py::test_estimators[LinearRegression()-check_fit1d]
tests/test_common.py::test_estimators[LinearRegression()-check_fit2d_predict1d]
tests/test_common.py::test_estimators[LinearRegression()-check_requires_y_none]
+tests/test_common.py::test_estimators[LinearSVC()-check_estimator_cloneable0]
+tests/test_common.py::test_estimators[LinearSVC()-check_estimator_cloneable1]
+...
+tests/test_common.py::test_estimators[LinearSVC()-check_requires_y_none]
tests/test_common.py::test_estimators[LinearSVC(max_iter=20)-check_estimator_cloneable0]
tests/test_common.py::test_estimators[LinearSVC(max_iter=20)-check_estimator_cloneable1]
...
tests/test_common.py::test_estimators[LinearSVC(max_iter=20)-check_requires_y_none]
-tests/test_common.py::test_estimators[LinearSVC()-check_estimator_cloneable0]
-tests/test_common.py::test_estimators[LinearSVC()-check_estimator_cloneable1]
-...
-tests/test_common.py::test_estimators[LinearSVC()-check_requires_y_none]
tests/test_common.py::test_estimators[LinearSVR()-check_estimator_cloneable0]
The key issue is that LinearSVC() and LinearSVC(max_iter=20) appear in different orders between test workers, causing pytest-xdist to fail with "Different tests were collected" errors.
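To confirm the mismatch locally without xdist, the test IDs from two separate collection runs can be compared directly. The sketch below assumes the IDs have already been captured as ordered lists of strings (for example from the output of `pytest --collect-only -q`). The helper name `collection_diff` is hypothetical, and the comparison is only analogous to the check pytest-xdist performs, not its actual implementation.

```python
import difflib


def collection_diff(ids_a, ids_b, name_a="gw0", name_b="gw2"):
    """Return unified-diff lines between two ordered lists of test IDs.

    An empty list means both runs collected the same tests in the same order.
    """
    return list(
        difflib.unified_diff(
            ids_a, ids_b, fromfile=name_a, tofile=name_b, lineterm=""
        )
    )


# Toy data mirroring the failure above: same tests, different order.
run_a = [
    "tests/test_common.py::test_estimators[LinearSVC(max_iter=20)-check_requires_y_none]",
    "tests/test_common.py::test_estimators[LinearSVC()-check_requires_y_none]",
]
run_b = list(reversed(run_a))

for line in collection_diff(run_a, run_b):
    print(line)
```

An empty diff across repeated collection runs would suggest the ordering is stable; any non-empty diff reproduces the problem without needing multiple workers.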
Example command that reproduces the failure:
./python/cuml/cuml_accel_tests/upstream/scikit-learn/run-tests.sh -n 4 -v
Proposed Solutions
Investigate the changes from #6866 (LinearSVC and LinearSVR in cuml.accel) to understand what causes the non-determinism.
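Whatever the root cause turns out to be, one possible direction is to give the patched estimator list an explicit, total ordering before it reaches test parametrization. The sketch below is a hypothetical illustration, not the code from #6866: `stable_estimator_order` and the `is_proxy` predicate are made-up names, the classes are stand-ins, and it assumes the list has the `(name, class)` tuple shape that sklearn's `all_estimators` returns. It places proxy estimators first and breaks ties by name and import path, so the result does not depend on the input order.

```python
import itertools


def stable_estimator_order(estimators, is_proxy):
    """Sort (name, class) pairs: proxies first, then by name and import path.

    `is_proxy` is a hypothetical predicate; how proxies are really
    identified in cuml.accel is an assumption left to the caller.
    """

    def key(item):
        name, cls = item
        # False sorts before True, so entries where is_proxy(cls) holds come first.
        return (not is_proxy(cls), name, cls.__module__, cls.__qualname__)

    return sorted(estimators, key=key)


# Stand-in classes for illustration only.
class LinearSVC:  # plays the role of the upstream estimator
    pass


class LinearSVR:
    pass


class ProxyLinearSVC:  # plays the role of an accelerated proxy
    is_proxy = True


def is_proxy(cls):
    return getattr(cls, "is_proxy", False)


pairs = [
    ("LinearSVR", LinearSVR),
    ("LinearSVC", LinearSVC),
    ("LinearSVC", ProxyLinearSVC),
]

# Every permutation of the input yields the same output order.
orders = {
    tuple(stable_estimator_order(p, is_proxy))
    for p in itertools.permutations(pairs)
}
assert len(orders) == 1
```

Including the module and qualified name in the key matters when a proxy and the original class share the same estimator name, since sorting by name alone would leave their relative order dependent on the input.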