Improve cuML's sklearn compatibility through systematic testing and validation of estimators.
Impact
- Better integration with sklearn ecosystem and meta-estimators
- Reduced limitations for cuml.accel
- Prevention of compatibility regressions
- Improved documentation of compatibility status
Objectives
O1: Establish Testing Framework
Run sklearn's check_estimator against every cuML estimator, including estimators that are expected to fail some checks.
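A minimal sketch of how such a test could be wired up, assuming scikit-learn >= 1.6 (where `parametrize_with_checks` accepts `expected_failed_checks`) and pytest. The registry name `EXPECTED_FAILURES`, the helper `build_compat_test`, and the entry for `ExampleEstimator` are hypothetical placeholders, not actual cuML code or known cuML failures.

```python
"""Sketch: run sklearn's estimator checks with a registry of expected failures."""

# Hypothetical registry: estimator class name -> {check name: reason}.
# The entry below is a placeholder, not a real cuML failure.
EXPECTED_FAILURES = {
    "ExampleEstimator": {
        "check_estimators_pickle": "placeholder reason; link the tracking issue",
    },
}


def expected_failed_checks(estimator):
    """Return {check_name: reason} for an estimator instance (empty if none)."""
    return EXPECTED_FAILURES.get(type(estimator).__name__, {})


def build_compat_test(estimators):
    """Build a pytest test that runs every sklearn check on each estimator.

    Assumes scikit-learn >= 1.6 and pytest are installed. sklearn is imported
    lazily so the registry above can be used without it.
    """
    from sklearn.utils.estimator_checks import parametrize_with_checks

    @parametrize_with_checks(
        estimators, expected_failed_checks=expected_failed_checks
    )
    def test_sklearn_compatibility(estimator, check):
        # Checks listed in the registry are marked xfail by sklearn;
        # every other check must pass.
        check(estimator)

    return test_sklearn_compatibility
```

In a test module, something like `test_sklearn_compatibility = build_compat_test([SomeEstimator()])` would expose the parametrized test to pytest. Keeping the reasons in one registry also gives a single place to document known gaps.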
O2: Address High-Impact Issues
O3: Prevent Regressions
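One way to guard against regressions is to make the checks opt-out: every estimator is picked up automatically and must be excluded explicitly, with a stated reason. The sketch below illustrates the idea on a stand-in module. The names `OPT_OUT`, `discover_estimators`, `NewModel`, `LegacyModel`, and `Helper` are all hypothetical, and treating "has a callable `fit`" as the definition of an estimator is an assumption made for illustration.

```python
import inspect
import types

# Hypothetical opt-out list: class name -> reason for exclusion.
OPT_OUT = {"LegacyModel": "placeholder reason; link the tracking issue"}


def discover_estimators(module, opt_out=None):
    """Return public classes in `module` that define `fit`, minus opt-outs."""
    opt_out = OPT_OUT if opt_out is None else opt_out
    found = []
    # inspect.getmembers returns (name, value) pairs sorted by name.
    for name, obj in inspect.getmembers(module, inspect.isclass):
        if name.startswith("_"):
            continue  # private or dunder attribute
        if not callable(getattr(obj, "fit", None)):
            continue  # not estimator-like under this sketch's assumption
        if name in opt_out:
            continue  # explicitly excluded
        found.append(obj)
    return found


# Stand-in module for demonstration (cuML itself is not imported here).
class NewModel:
    def fit(self, X, y=None):
        return self


class LegacyModel:
    def fit(self, X, y=None):
        return self


class Helper:
    pass


demo = types.ModuleType("demo")
demo.NewModel = NewModel
demo.LegacyModel = LegacyModel
demo.Helper = Helper

tested = discover_estimators(demo)
```

With this layout, an estimator added later is tested by default, and skipping it requires a visible entry in the opt-out list.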
Non-Goals
- Perfect compatibility: Only address high-impact issues
- Immediate resolution of all gaps: Defer low-impact issues
- Testing all sklearn versions: Single-version testing is sufficient
- Exhaustive documentation: Issue-based documentation is sufficient for most cases
Timeline
- 25.10: Establish testing infrastructure and address high-impact issues
- Ongoing: Address high- and medium-impact issues
Risks and Mitigations
- Scope creep: Prioritize issues rather than fixing everything
- Maintenance burden: Keep test infrastructure lean
- False positives: Evaluate impact before adjusting API/implementation
Success Criteria
Tasks
- O1: check_estimator and other standard sklearn compatibility checks #6432
- O3: check_estimator tests opt-out rather than opt-in