Use RBC from cuVS#6644
Conversation
csadorf
left a comment
There was a problem hiding this comment.
Overall looks good, but I have a few questions and requests.
| find_and_configure_cuvs(VERSION ${CUML_MIN_VERSION_cuvs} | ||
| FORK rapidsai | ||
| PINNED_TAG branch-${CUML_BRANCH_VERSION_cuvs} | ||
| PINNED_TAG fea-2408-rbc |
There was a problem hiding this comment.
Can we either block this PR or create an issue to track unpinning, please?
There was a problem hiding this comment.
Your blocking review is fine, I will let you know when I unpin.
| if algorithm == "rbc": | ||
| if datatype == np.float64 or out_dtype in ["int32", np.int32]: | ||
| pytest.skip("RBC does not support float64 dtype or int32 labels") |
There was a problem hiding this comment.
There was a problem hiding this comment.
Yes, fair enough. Changed the labels.
| from libcpp cimport bool | ||
| from libcpp.vector cimport vector | ||
| from pylibraft.common.handle cimport handle_t | ||
| from pylibraft.common.mdspan cimport * |
There was a problem hiding this comment.
I'm aware that this a common pattern in this codebase, but should we try to avoid wildcard imports in the future?
There was a problem hiding this comment.
Can you explain why we need to avoid them? Happy to not do this, just want to know for my own knowledge.
There was a problem hiding this comment.
Quoting the "imports" section from PEP 8:
Wildcard imports (from import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools. [...]
It obfuscates what's actually present in the namespace which makes it harder to understand what exact interface is exposed through the module and what a symbol's provenance is. This might be a Cython pattern that I am not aware of, but for general Python code this is an indisputed anti-pattern.
There was a problem hiding this comment.
I think it's less bad and not necessarily an antipattern for cimports. It's not uncommon in cython codebases to have large headers and include them with cimport * (using cimport * is basically the same as a C include). pyarrow does this a bunch, for example.
Qualified imports make it easier to understand what's being pulled in, and also lets linters check when a cimport is no longer needed (I just removed a bunch of unnecessary ones in #6600, for example). I wouldn't block on adding a cimport *, but if the number of included symbols is small, I also think it'd be nicer to spell them out explicitly.
There was a problem hiding this comment.
@jcrist Thanks for the perspective. Definitely not blocking for this PR. Just wanted to get a take on this.
There was a problem hiding this comment.
using cimport * is basically the same as a C include
This is the equivalence I was applying. But I removed the wildcard imports 2c30ad1, thanks both!
jcrist
left a comment
There was a problem hiding this comment.
Just a few questions, most of this looks like a straightforward port.
| ): | ||
| if algorithm == "rbc": | ||
| if datatype == np.float64 or out_dtype in ["int32", np.int32]: | ||
| pytest.skip("RBC does not support float64 dtype or int32 labels") |
There was a problem hiding this comment.
Is this a reduction in support (and if so, why? the linked PR looked like it just moved what was in raft to cuvs, I'd expect support to remain the same)? Or did this not work before (and would fallback)?
There was a problem hiding this comment.
Also, what happens if you run DBSCAN with these params and dtypes? I see something in the c++ layer to log a warning and fallback - is that what's hit here?
There was a problem hiding this comment.
Is this a reduction in support (and if so, why? the linked PR looked like it just moved what was in raft to cuvs, I'd expect support to remain the same)? Or did this not work before (and would fallback)?
This is a reduction in support, yes. @csadorf also pointed it out here #6644 (comment). The reason why RAFT supported it but cuVS does not is because RAFT was header-only so we could compile for all the types we want, whereas cuVS pre-compiles these types for us. cuVS offers only float support.
Personally, I am fine with us not asking cuVS to provide double support because RBC is extremely expensive to compile. Every unique combination of type adds 20 MB in binary size.
Also, what happens if you run DBSCAN with these params and dtypes? I see something in the c++ layer to log a warning and fallback - is that what's hit here?
Yes, the C++ layer logs a warning and provides a fallback. But in the tests we don't want to hit the fallback as the fallback is already tested as part of the param combinations.
There was a problem hiding this comment.
Personally I'm fine with dropping this and if the algorithm still runs without it (and warns) then I don't think this is breaking enough to be worthy of a deprecation cycle.
But in the tests we don't want to hit the fallback as the fallback is already tested as part of the param combination
That said, I do think it's worth testing that the fallback actually fallsback. Are you saying that the fallback (RBC w/ these datatypes) is run elsewhere and we see the fallback is hit there? Or only that the other algorithm is tested elsewhere? If the latter, then I think we'll want to ensure the former is tested somewhere.
Also FWIW, unless this test fails due to numeric differences or takes a ton of time, I don't see value in skipping it here personally.
There was a problem hiding this comment.
I see your point, it is the latter. The tests are quick and not really flaky. I'll remove the skip.
There was a problem hiding this comment.
I "un"skipped the fallback test in our C++ tests because they are faster to run and generally more stable. 5214539
What's the expected user impact? Should this go through a deprecation cycle? Is this still an issue if we statically link to cuVS? |
Nothing apparent, we have a fallback available. While
This is a good question. If you feel strongly about it, then we can. It will most likely delay our PyPI plans though.
Yes, static or dynamic link does not matter. If cuVS does not provide the type support we can't use it. |
But here we are actually limiting the datatype, not just the index type, are we?
We need to understand the user impact to be able to weigh that decision. Having a fallback to a less performant method is insufficient mitigation IMO. I would assume that dropping support for float64 datatypes has more than just marginal impact. @cjnolet I would be interested in your take on this, too. Can we safely assume that most DBSCAN users would either prefer or be fine with working with single-precision datasets? |
|
@viclafargue can you review this PR? |
csadorf
left a comment
There was a problem hiding this comment.
I'm approving, because I am convinced that removing double precision support will have very limited, albeit non-zero impact on users.
That said, for changes of this nature in the future, I would recommend more upfront communication about new limitations and providing users with adequate time to adapt through a deprecation cycle.
While I have some concerns about the implementation process, I understand that this change is necessary to advance cuVS adoption.
Sorry for being late to this discussion @csadorf and @divyegala. No doubt, the decision to go from double + float support to just float is going to have a non-zero impact, but the longer we go and the more we're faced with these expensive decisions about hosting device code for multiple formats, the more I'm thjinking we should startt moving towards supporting only float across most, if not all, of our algorithms.
@csadorf I agree, it's unfortunate this came up as a rather last-minute fix/workaround, and I definitely agree we should have moe discussions about how we are going to migrate longer term. |
dantegd
left a comment
There was a problem hiding this comment.
After many years developing our algorithms, I agree with @cjnolet analysis and points.
For changing a full algorithm support (say whether RF supports float64 or not) we would definitely need a deprecation cycle and more discussion and analysis, but the impact here is even smaller than that would be and seems like an acceptable choice to me.
|
/merge |
viclafargue
left a comment
There was a problem hiding this comment.
Apologies for the delayed review, @divyegala. I've gone through it, and everything looks good to me.
Depends on rapidsai/cuvs#218. This PR reduces the supported combination of types for
RBCmethod indbscan.cuto only<float, int64_t>. This is because this is the only type combination that cuVS compiles RBC for, which is otherwise very expensive and slow to compile.Effects on Binary Size
Tracked here #6626 (comment)