[enhancement] Enable array API for SVM algorithms #2209
base: main
Conversation
/intelci: run

/intelci: run

    sample_weight = _check_sample_weight(sample_weight, X)
    # oneDAL only accepts sample_weights, apply class_weight directly

    # due to the nature of how sklearn checks nu in NuSVC (by not checking
Does this perhaps have some misplaced parenthesis? Or is it perhaps missing some sentence?
I think nothing is missing.
Those are two different sentences that describe the code above (the first one) and below (the second one).
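The quoted comment above notes that oneDAL only accepts per-sample weights, so any `class_weight` setting has to be folded into `sample_weight` before the oneDAL call. A minimal numpy-only sketch of that idea (the function name and signature are illustrative, not from the PR; sklearnex uses sklearn's `_check_sample_weight` and class-weight utilities instead):

```python
import numpy as np

def fold_class_weight(y, sample_weight=None, class_weight=None):
    """Illustrative: fold a 'balanced' class_weight into per-sample weights."""
    y = np.asarray(y)
    if sample_weight is None:
        sample_weight = np.ones(y.shape[0], dtype=np.float64)
    sample_weight = np.array(sample_weight, dtype=np.float64, copy=True)
    if class_weight == "balanced":
        # same formula scikit-learn documents: n_samples / (n_classes * bincount)
        classes, counts = np.unique(y, return_counts=True)
        per_class = y.shape[0] / (classes.shape[0] * counts)
        # multiply each sample's weight by the weight of its class
        sample_weight *= per_class[np.searchsorted(classes, y)]
    return sample_weight

# three samples of class 0, one of class 1 -> the minority class is upweighted
print(fold_class_weight([0, 0, 0, 1], class_weight="balanced"))
```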
The CI issues:
Looks like it could be solved by adding an extra check for single-class data in the patching conditions. For the other issue:
It should be solvable by merging the latest master.
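A minimal sketch of the single-class guard suggested above (names are hypothetical, not from the PR): report oneDAL support for `fit` only when the target contains at least two classes, and fall back to stock scikit-learn otherwise.

```python
import numpy as np

def supports_onedal_fit(y):
    """Illustrative patching condition: oneDAL SVM fitting needs >= 2 classes."""
    n_classes = np.unique(np.asarray(y)).shape[0]
    return n_classes >= 2

print(supports_onedal_fit([0, 1, 0, 1]))  # True
print(supports_onedal_fit([1, 1, 1]))     # False
```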
Looks like these changes will be required for sklearn 1.8, as otherwise conformance tests throw errors about the 'xp' argument in some methods.

/intelci: run
There will be some changes required for sklearn 1.8 that generate merge conflicts with this PR: Perhaps they could all be incorporated here instead if it makes the merging easier.
@david-cortes-intel Ok, I will do that. Anyway, I will be fixing pre-commit issues here.
/intelci: run

/intelci: run

/intelci: run

/intelci: run
Description
This refactors and standardizes the SVM algorithms to follow other sklearnex estimators and adds zero-copy array API GPU support, while reducing the code by ~300-400 lines. This required the following changes:
- `SVMType` object from `onedal.svm`, which is not necessary for proper operation
- `__init__` signature is standardized for all onedal SVM estimators; unused kwargs are removed for oneDAL calls before use
- `gamma` keyword in the onedal python estimator defaults to `'auto'` rather than `'scale'`. It is implicitly expected that the user will calculate `gamma` before passing the value to the onedal python estimator
- `fit` in onedal SVM estimators adds a `class_count` kwarg, as oneDAL requires it to be defined beforehand. Calculating this in the onedal python estimator is scikit-learn conformance and is moved to the sklearnex estimator
- `csr_array` support is added (not just `csr_matrix`)
- `get_sklearnex_version` is removed, as it is an unnecessary aliasing of `daal_check_version`
- `sklearnex.svm._common` is renamed to `sklearnex.svm._base` to match scikit-learn
- `sklearnex.svm` files containing sklearnex classes are centralized into `sklearnex.svm._classes` to minimize duplicated code and match scikit-learn
- The `BaseSVM` sklearnex object is greatly expanded to reduce code duplication and ease maintenance. The `BaseSVC` and `BaseSVR` classes are expanded to remove code duplication.
- `_svm_sample_weight_check` replaces `_get_sample_weight`, as the central `_check_sample_weight` function in `sklearnex.utils.validation` is used instead. This function provides SVM-specific checks per class while maximally re-using available array API code in sklearnex. This should reduce maintenance
- `_compute_gamma_sigma` is moved to sklearnex to match scikit-learn and is separated for easier maintenance
- `_onedal_cpu_supported` and `_onedal_gpu_supported` use `_n_jobs_supported_onedal_methods` to define methods (not including `fit`) for oneDAL offloading checks. This reduces maintenance by making the `n_jobs`-supporting list the single central location defining oneDAL-supported methods.
- The SVM method `_validate_targets` is defined locally with an array API-compliant version for classification and regression
- The `_onedal_factory` object allows for easy future SPMD support in the SVM algos and maximal code reuse, and follows precedent in the repository. This will minimize maintenance.
- The `_save_attributes` function now takes the `xp` array namespace to properly handle onedal-to-sklearn data conversions
- `_onedal_ovr_decision_function` is partially rewritten for array API support (fancy indexing causes problems). Further performance optimization should be done given its nature
- The `enable_array_api` decorator is used on SVM sklearnex estimators based on limitations in `LabelEncoder`, `accuracy_score` and `r2_score` (sklearn > 1.5)
- `_onedal_cpu_supported` and `_onedal_gpu_supported` are modified for array API support and for reusing maximal scikit-learn functionality for minimal maintenance
- `target_offload` for `predict_proba` / probabilities
- `sklearnex.utils.class_weight` guarantees numpy support when an array supports `__array_namespace__` but does not fully implement the array API standard (i.e. the `device` attribute)
- `array_api.rst` is updated to show array API support in the documentation
- `_onedal_validate_targets` function, which replicated sklearn's `_validate_targets` but with array API support and converts data to match the X data dtype

PR should start as a draft, then move to the ready-for-review state after CI is passed and all applicable checkboxes are closed.
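Regarding the `gamma` handling above: the resolution of `'scale'` / `'auto'` now happens in the sklearnex layer before the value reaches the onedal estimator. An illustrative sketch of that resolution (not the actual sklearnex code), following scikit-learn's documented semantics of `gamma='scale'` being `1 / (n_features * X.var())` and `gamma='auto'` being `1 / n_features`:

```python
import numpy as np

def resolve_gamma(gamma, X):
    """Illustrative: turn a gamma keyword into the numeric value oneDAL needs."""
    X = np.asarray(X, dtype=np.float64)
    n_features = X.shape[1]
    if gamma == "scale":
        var = X.var()
        # scikit-learn falls back to 1.0 when the variance is zero
        return 1.0 / (n_features * var) if var != 0 else 1.0
    if gamma == "auto":
        return 1.0 / n_features
    return float(gamma)  # a numeric gamma is passed through unchanged

X = np.array([[0.0, 1.0], [2.0, 3.0]])
print(resolve_gamma("auto", X))   # 0.5
print(resolve_gamma("scale", X))
```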
This approach ensures that reviewers don't spend extra time asking for regular requirements.
You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, a PR with a docs update doesn't require checkboxes for performance, while a PR with any change to actual code should have checkboxes and justify how this code change is expected to affect performance (or the justification should be self-evident).
Checklist to comply with before moving PR from draft:
PR completeness and readability
Testing