Use RBC from cuVS #6644

Merged
rapids-bot[bot] merged 17 commits into rapidsai:branch-25.06 from
divyegala:cuvs-rbc
May 14, 2025

Conversation

@divyegala
Member

@divyegala divyegala commented May 7, 2025

Depends on rapidsai/cuvs#218. This PR reduces the supported combination of types for the RBC method in dbscan.cu to only <float, int64_t>. This is because it is the only type combination that cuVS compiles RBC for; RBC is otherwise very expensive and slow to compile.

Effects on Binary Size

Tracked here #6626 (comment)

@divyegala divyegala self-assigned this May 7, 2025
@divyegala divyegala requested a review from a team as a code owner May 7, 2025 22:35
@divyegala divyegala added the improvement label May 7, 2025
@divyegala divyegala requested review from a team as code owners May 7, 2025 22:35
@divyegala divyegala added the non-breaking label May 7, 2025
@github-actions github-actions bot added the Cython / Python, CMake, and CUDA/C++ labels May 7, 2025
@divyegala divyegala mentioned this pull request May 7, 2025
5 tasks
@divyegala divyegala changed the title Use RBC from cuVS [DO NOT MERGE] Use RBC from cuVS May 7, 2025
@divyegala divyegala changed the title [DO NOT MERGE] Use RBC from cuVS Use RBC from cuVS May 9, 2025
Contributor

@csadorf csadorf left a comment

Overall looks good, but I have a few questions and requests.

Comment thread cpp/cmake/thirdparty/get_cuvs.cmake Outdated
find_and_configure_cuvs(VERSION ${CUML_MIN_VERSION_cuvs}
FORK rapidsai
PINNED_TAG branch-${CUML_BRANCH_VERSION_cuvs}
PINNED_TAG fea-2408-rbc
Contributor

Can we either block this PR or create an issue to track unpinning, please?

Member Author

Your blocking review is fine, I will let you know when I unpin.

Member Author

Unpinned 9192e3d

Comment on lines +65 to +67
if algorithm == "rbc":
if datatype == np.float64 or out_dtype in ["int32", np.int32]:
pytest.skip("RBC does not support float64 dtype or int32 labels")
Contributor

⚠️ Is that a new limitation? If so then this is a breaking change.

Member Author

Yes, fair enough. Changed the labels.

Comment thread python/cuml/cuml/tests/test_dbscan.py
from libcpp cimport bool
from libcpp.vector cimport vector
from pylibraft.common.handle cimport handle_t
from pylibraft.common.mdspan cimport *
Contributor

I'm aware that this is a common pattern in this codebase, but should we try to avoid wildcard imports in the future?

Member Author

Can you explain why we need to avoid them? Happy to not do this, just want to know for my own knowledge.

Contributor

Quoting the "imports" section from PEP 8:

Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools. [...]

It obfuscates what's actually present in the namespace, which makes it harder to understand what exact interface is exposed through the module and what a symbol's provenance is. This might be a Cython pattern that I am not aware of, but for general Python code this is an undisputed anti-pattern.

Member

I think it's less bad and not necessarily an antipattern for cimports. It's not uncommon in cython codebases to have large headers and include them with cimport * (using cimport * is basically the same as a C include). pyarrow does this a bunch, for example.

Qualified imports make it easier to understand what's being pulled in, and also lets linters check when a cimport is no longer needed (I just removed a bunch of unnecessary ones in #6600, for example). I wouldn't block on adding a cimport *, but if the number of included symbols is small, I also think it'd be nicer to spell them out explicitly.
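The provenance concern above can be made concrete with a small, self-contained Python illustration (hypothetical modules, not cuML code): with wildcard-style imports, one module's name silently shadows another's, and nothing in the importing file shows the collision.

```python
import types

# Two hypothetical modules that both define `distance`.
mod_a = types.ModuleType("mod_a")
exec("def distance(x, y):\n    return abs(x - y)", mod_a.__dict__)
mod_b = types.ModuleType("mod_b")
exec("def distance(x, y):\n    return (x - y) ** 2", mod_b.__dict__)

# Simulate `from mod_a import *` followed by `from mod_b import *`:
# each copies every public name into the local namespace.
ns = {}
ns.update({k: v for k, v in vars(mod_a).items() if not k.startswith("_")})
ns.update({k: v for k, v in vars(mod_b).items() if not k.startswith("_")})

# mod_b's `distance` silently replaced mod_a's; nothing here says so.
print(ns["distance"](1, 3))  # 4 (squared), not 2 (absolute)
```

An explicit `from mod_b import distance` would make both the origin and the shadowing visible to readers and linters.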

Contributor

@csadorf csadorf May 9, 2025

@jcrist Thanks for the perspective. Definitely not blocking for this PR. Just wanted to get a take on this.

Member Author

using cimport * is basically the same as a C include

This is the equivalence I was applying. But I removed the wildcard imports 2c30ad1, thanks both!

Member

@jcrist jcrist left a comment

Just a few questions, most of this looks like a straightforward port.

):
if algorithm == "rbc":
if datatype == np.float64 or out_dtype in ["int32", np.int32]:
pytest.skip("RBC does not support float64 dtype or int32 labels")
Member

Is this a reduction in support (and if so, why? the linked PR looked like it just moved what was in raft to cuvs, I'd expect support to remain the same)? Or did this not work before (and would fallback)?

Member

Also, what happens if you run DBSCAN with these params and dtypes? I see something in the c++ layer to log a warning and fallback - is that what's hit here?

Member Author

Is this a reduction in support (and if so, why? the linked PR looked like it just moved what was in raft to cuvs, I'd expect support to remain the same)? Or did this not work before (and would fallback)?

This is a reduction in support, yes. @csadorf also pointed it out here #6644 (comment). The reason RAFT supported it but cuVS does not is that RAFT was header-only, so we could compile for all the types we want, whereas cuVS pre-compiles these types for us. cuVS offers only float support.

Personally, I am fine with us not asking cuVS to provide double support because RBC is extremely expensive to compile. Every unique type combination adds 20 MB of binary size.

Also, what happens if you run DBSCAN with these params and dtypes? I see something in the c++ layer to log a warning and fallback - is that what's hit here?

Yes, the C++ layer logs a warning and provides a fallback. But in the tests we don't want to hit the fallback as the fallback is already tested as part of the param combinations.
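The warn-and-fall-back behavior described here can be sketched in plain Python. This is a hypothetical illustration of the dispatch logic, not cuML's actual code; the function name and return values are invented.

```python
import warnings

import numpy as np

def select_dbscan_algorithm(algorithm, dtype, out_dtype):
    """Hypothetical sketch: cuVS precompiles RBC only for
    <float, int64_t>, so any other combination warns and
    falls back to the brute-force strategy."""
    if algorithm == "rbc" and (dtype != np.float32 or out_dtype != np.int64):
        warnings.warn(
            "RBC supports only float32 data with int64 labels; "
            "falling back to brute force"
        )
        return "brute"
    return algorithm

print(select_dbscan_algorithm("rbc", np.float32, np.int64))  # rbc
print(select_dbscan_algorithm("rbc", np.float64, np.int64))  # brute, with a warning
```

In the tests, skipping the unsupported combinations avoids exercising this fallback path redundantly, since the brute-force strategy is already covered by other parameter combinations.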

Member

Personally I'm fine with dropping this and if the algorithm still runs without it (and warns) then I don't think this is breaking enough to be worthy of a deprecation cycle.

But in the tests we don't want to hit the fallback as the fallback is already tested as part of the param combination

That said, I do think it's worth testing that the fallback actually falls back. Are you saying that the fallback (RBC w/ these datatypes) is run elsewhere and we see the fallback is hit there? Or only that the other algorithm is tested elsewhere? If the latter, then I think we'll want to ensure the former is tested somewhere.

Also FWIW, unless this test fails due to numeric differences or takes a ton of time, I don't see value in skipping it here personally.

Member Author

I see your point, it is the latter. The tests are quick and not really flaky. I'll remove the skip.

Member Author

I "un"skipped the fallback test in our C++ tests because they are faster to run and generally more stable. 5214539

@divyegala divyegala added the breaking label and removed the non-breaking label May 9, 2025
@csadorf
Contributor

csadorf commented May 9, 2025

This is because it is the only type combination that cuVS compiles RBC for; RBC is otherwise very expensive and slow to compile.

What's the expected user impact? Should this go through a deprecation cycle? Is this still an issue if we statically link to cuVS?

@divyegala
Member Author

divyegala commented May 9, 2025

What's the expected user impact?

Nothing apparent, we have a fallback available. While RBC is definitely an optimization over the BRUTE_FORCE strategy, it is just too expensive for us to compile and support 4 different type combinations. I think just float is good enough for almost all use cases. Some context on RBC: it is primarily a low-dimensional optimization and is most performant for 2 or 3 columns in the data matrix.

Should this go through a deprecation cycle?

This is a good question. If you feel strongly about it, then we can. It will most likely delay our PyPI plans though.

Is this still an issue if we statically link to cuVS?

Yes, static or dynamic link does not matter. If cuVS does not provide the type support we can't use it.

@csadorf
Contributor

csadorf commented May 9, 2025

What's the expected user impact?

Nothing apparent, we have a fallback available. While RBC is definitely an optimization over the BRUTE_FORCE strategy, it is just too expensive for us to compile and support 4 different type combinations. I think just float is good enough for almost all use cases. Some context on RBC: it is primarily a low-dimensional optimization and is most performant for 2 or 3 columns in the data matrix.

But here we are actually limiting the datatype, not just the index type, aren't we?

Should this go through a deprecation cycle?

This is a good question. If you feel strongly about it, then we can. It will most likely delay our PyPI plans though.

We need to understand the user impact to be able to weigh that decision. Having a fallback to a less performant method is insufficient mitigation IMO. I would assume that dropping support for float64 datatypes has more than just marginal impact.

@cjnolet I would be interested in your take on this, too. Can we safely assume that most DBSCAN users would either prefer or be fine with working with single-precision datasets?

@github-actions github-actions Bot removed the CMake label May 9, 2025
@divyegala divyegala removed request for a team, bdice and robertmaynard May 9, 2025 21:26
@divyegala divyegala requested review from csadorf and jcrist May 13, 2025 03:00
@divyegala
Member Author

@viclafargue can you review this PR?

Contributor

@csadorf csadorf left a comment

I'm approving, because I am convinced that removing double precision support will have very limited, albeit non-zero impact on users.

That said, for changes of this nature in the future, I would recommend more upfront communication about new limitations and providing users with adequate time to adapt through a deprecation cycle.

While I have some concerns about the implementation process, I understand that this change is necessary to advance cuVS adoption.

@cjnolet
Member

cjnolet commented May 13, 2025

I think just float is good enough for almost all use cases. Some context on RBC: it is primarily a low-dimensional optimization and is most performant for 2 or 3 columns in the data matrix.

Sorry for being late to this discussion @csadorf and @divyegala. No doubt, the decision to go from double + float support to just float is going to have a non-zero impact, but the longer we go and the more we're faced with these expensive decisions about hosting device code for multiple formats, the more I'm thinking we should start moving towards supporting only float across most, if not all, of our algorithms.

  1. The expense of supporting double out of the box is much greater than users understanding that they can normalize and/or scale their vectors in the (tiny) chance they ACTUALLY need double precision.
  2. One reason we opted to support both double and float from the start was that we wanted to be as user friendly as possible.
  3. Another, hidden, reason is that double can be more accurate for certain computations (such as gradients in solvers and distances), which can sometimes eat up the excess precision available in floats.
  4. However, in this latter case, I think the proper way to handle it is to promote to double during those computations and use float everywhere else. Hopefully we will start moving in this direction.
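The "promote internally" idea in point 4 can be illustrated with a small NumPy sketch (illustrative only, not cuML code): data stays in float32, but a precision-sensitive reduction is carried out in float64.

```python
import math

import numpy as np

rng = np.random.default_rng(0)
x = rng.random(10_000).astype(np.float32)  # data stored as float32

# Naive: accumulate the sum of squares entirely in float32.
naive = np.float32(0.0)
for v in x:
    naive = np.float32(naive + v * v)

# Promoted: upcast to float64 just for the reduction,
# keeping the stored data in float32.
promoted = float(np.sum(x.astype(np.float64) ** 2))

# High-precision reference for comparison.
ref = math.fsum(float(v) ** 2 for v in x)

print(abs(naive - ref), abs(promoted - ref))  # promoted error is far smaller
```

The storage cost stays at float32 while the accumulation error drops by many orders of magnitude, which is the trade-off being advocated above.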

@csadorf I agree, it's unfortunate this came up as a rather last-minute fix/workaround, and I definitely agree we should have more discussions about how we are going to migrate longer term.

Member

@dantegd dantegd left a comment

After many years developing our algorithms, I agree with @cjnolet's analysis and points.

For changing an algorithm's full support (say, whether RF supports float64 or not) we would definitely need a deprecation cycle and more discussion and analysis, but the impact here is even smaller than that would be, and it seems like an acceptable choice to me.

Comment thread cpp/src/dbscan/dbscan.cuh Outdated
@divyegala
Member Author

/merge

@rapids-bot rapids-bot Bot merged commit a22a259 into rapidsai:branch-25.06 May 14, 2025
92 of 93 checks passed
Contributor

@viclafargue viclafargue left a comment

Apologies for the delayed review, @divyegala. I've gone through it, and everything looks good to me.

@divyegala divyegala linked an issue May 14, 2025 that may be closed by this pull request

Labels

breaking, CUDA/C++, Cython / Python, improvement

Development

Successfully merging this pull request may close these issues.

Reduce object sizes of dbscan and knn

7 participants