All neighbors C and Python bindings#1282
All neighbors C and Python bindings#1282rapids-bot[bot] merged 20 commits intorapidsai:branch-25.10from
Conversation
jinsolp
left a comment
There was a problem hiding this comment.
Thanks for this PR @viclafargue ! Adding a few comments 🙂
| def __dealloc__(self): | ||
| check_cuvs(cuvsResourcesDestroy(self.c_obj)) | ||
|
|
||
| cdef class SNMGResources: |
There was a problem hiding this comment.
I didn't know we had a separate resource wrapper in cuVS. If there is no explicit reason to not import raft, I think we can just use pylibraft.common.DeviceResourcesSNMG (used like this in cuML).
I think we should be able to have cuvsResources_t to be the arg type in the c api which will dynamically dispatch to the multi gpu if the user passes DeviceResourcesSNMG as the handle.
There was a problem hiding this comment.
I removed the cuvsSNMGResources_t type to simplify the API. But, we need the user to be able to instantiate a multi-GPU resource from C. This also enables other bindings. When it comes to handling RAFT handles, this is definitely something we would want to do, but I feel like this may be out of the scope of this PR. Implementing it only in All Neighbors would be inconsistent with the rest of the cuVS Python API. Maybe others would have different insights though. cc @cjnolet
There was a problem hiding this comment.
Nitpick- can we please rename deviceSNMG* to DeviceMultiGpu all the way down to the C++ layer? I'm realizing this naming is becoming confusing to users.
Any C/C++ public APIs should absolutely not leak RAFT types. This is actually for more reasons than may be immediately obvious- it turns out the way symbol lookups are resolved in binaries are non-deterministic, and so they can causes nasty segfaults when, say, a symbol is looked up in cugraph or cuml, instantiated, and then later looked up in cuVS (where the instantiated symbol resolves to an uninstantiated one, thus causing a memory illegal access).
I think it's okay to use RAFT types in the Python layer only, so long as the symbol is being exposed through pylibraft's cython/python layer and not being wrapped through cuVS' cython.
| @@ -0,0 +1,71 @@ | |||
| All-neighbors | |||
There was a problem hiding this comment.
I love the docs, but all-neighbros isn't really an index type, it's more-so a way to take a series of dense vectors and construct a knn graph from them (it's really a way to convert dense vectors into a corresponding knn graph when you think about it).
That said, I think this documentation is immensely useful, I'm just proposing we put it in a different spot in the dots (that better reflects its purpose) or rename the "indexes" directory/rst file to be something else- like neighbors.
There was a problem hiding this comment.
What would you like the exact change to be regarding documentation?
There was a problem hiding this comment.
I put the cpp doc here so maybe this should go under docs/source/c_api
There was a problem hiding this comment.
First change I would make is to not put it in the "indexes" docs (`docs/source/indexes/all_neighbors.rst).
There was a problem hiding this comment.
The all_neighbors.rst does not document the C API though. In the end, what made the most sense was to rename the indexes directory to neighbors and Nearest Neighbor Indexes to Nearest Neighbor. I also updated the links.
| return wrapper | ||
|
|
||
|
|
||
| def auto_sync_snmg_resources(f): |
There was a problem hiding this comment.
The more I think about this, the more I realize we should just be using pylibraft for this instead of wrapping a new type just in cuvs. That would allow cuml and others to benefit from this as well.
There was a problem hiding this comment.
I agree that it would be neat to allow the use of RAFT handles. But, I think that this would have to be a dedicated follow-up work across the full API. Otherwise, it would need a lot of rework and would make the cuVS API inconsistent.
There was a problem hiding this comment.
@viclafargue if you haven't done so already, can you please create an issue for this so it doesn't get lost in the noise of all other things going on?
| from cuvs.common.exceptions import check_cuvs | ||
|
|
||
|
|
||
| cdef class MultiGpuResources: |
There was a problem hiding this comment.
I haven't checked yet, but do we have a Github issue for this one? Can you please reference it in the TODO here so that we don't lose sight of it? Having a DeviceResources from raft and MultiGpuResources is not consistent and it's ultimately going to affect the user experience. We should make that fix asap (though we don't need to hold up this PR).
There was a problem hiding this comment.
I think Victor left an issue here. Please correct me if this isn't the right one @viclafargue
jinsolp
left a comment
There was a problem hiding this comment.
Thanks for the work Victor! Left a few comments : )
| @@ -0,0 +1,71 @@ | |||
| All-neighbors | |||
There was a problem hiding this comment.
I put the cpp doc here so maybe this should go under docs/source/c_api
| if snmg: | ||
| assert recall > 0.85 |
There was a problem hiding this comment.
Do we experience recall drops with snmg?
There was a problem hiding this comment.
The recall for multi-GPU IVF-PQ was a bit low. I updated the n_lists parameters for the multi-GPU case. I set a unique threshold at 0.83 to be resilient to minor testing inconsistencies.
There was a problem hiding this comment.
Got it. yeah ivfpq seemed to be very sensitive to n_lists.
jinsolp
left a comment
There was a problem hiding this comment.
Looking great @viclafargue ! A few final minor comments about the docs 🙂
b70ffd4 to
2c22e14
Compare
|
/merge |
Answers #1279