Migrate IVF-PQ from RAFT to cuVS#86
Conversation
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
|
/ok to test |
Signed-off-by: Mickael Ide <mide@nvidia.com>
|
/ok to test |
|
/ok to test |
Signed-off-by: Mickael Ide <mide@nvidia.com>
|
/ok to test |
cjnolet
left a comment
There was a problem hiding this comment.
I think this is a good start, but the layout and design of cuvs code is fundamentally quite different than what we've been doing in RAFT because it's not a header-only library. As such, we need to treat the src/ and include/ directories as a first-class C++ library, which means defining a very clear separation between interface and implementation. The include/ directory should contain only definitions and interfaces and all implementations should be compiled into source files in src/. Any and all implementation headers that are not part of the user-visible public APIs belong in src/.
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software |
There was a problem hiding this comment.
We should consider keeping this in raft with the other vocabulary types so it can continue to be used across different libraries that use raft. Also- we can't be defining device functions in an hpp file in cuVS, since it's not header-only. The only APIs users should be interacting with in cuVS should be pre-compiled runtime APIs.
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 |
There was a problem hiding this comment.
This shouldn't be a user-facing API and thus shouldn't be defined in include/. We should no longer be using include/ for headers just because they need to be used across cuvs. Use src/ for those headers. Only headers in include/ should be those that we expect the user to interact with directly (thus the number of headers in include/ should be very small.
There was a problem hiding this comment.
Having all of the declaration in src/ will lead to #include with long relative paths that can be problematic in the future.
To avoid that we can add the src/ directory to CMake target_include_directories() but this would lead to includes such as src/core/nvtx.hpp. Are we fine with that or do we want to keep something like cuvs/core/nvtx.hpp?
There was a problem hiding this comment.
I've done relative paths for cagra so far and haven't seen them to be too terrible. Is ivf-pq somehow making it more challenging to work with?
From a development perspective, I generally tend to prefer the use of the relative quotationed paths for things that are local to src/, (and thus internal) rather than muddying the line between the two and making it harder to discern which things are public APIs and which aren't.
| template <typename filter_t, typename = void> | ||
| struct takes_three_args : std::false_type {}; | ||
| template <typename filter_t> | ||
| struct takes_three_args< |
There was a problem hiding this comment.
Definitino in include/, implementation compiled into source file(s) in src/.
| return raft::make_device_vector_view<const bitset_t, index_t>(bitset_ptr_, n_elements()); | ||
| } | ||
|
|
||
| template <typename bitset_t, typename index_t> |
There was a problem hiding this comment.
I don't mind compiling bitset into cuvs, but i still think much of the implementation should live in headers in raft. I think what we can do is use the using clause in order to expose a cuvs::bitset while still having the bitset code officially live in raft. This just makes things more flexible for users of raft.
|
/ok to test |
…-2406-cagra_from_raft
|
/ok to test |
|
/ok to test |
|
/ok to test |
|
/ok to test |
1 similar comment
|
/ok to test |
|
/ok to test |
2 similar comments
|
/ok to test |
|
/ok to test |
|
/ok to test |
|
/ok to test |
|
/ok to test |
| * @param[in] label the id of the target list (cluster). | ||
| */ | ||
| void erase_list(raft::resources const& res, index<int64_t>* index, uint32_t label); | ||
|
|
There was a problem hiding this comment.
Looks like we are missing the reset_index helper. That helper is needed by FAISS. It can also be moved to the index class instead of being an external helper.
There was a problem hiding this comment.
We are missing unpack_contiguous_list_data and recompute_internal_state in the helpers.
There was a problem hiding this comment.
extract_centers is missing.
There was a problem hiding this comment.
Can you ensure that all the helpers have been migrated? They are essential for FAISS' future dependence on cuVS.
There was a problem hiding this comment.
cc @lowener- we will need to add these in follow-on PRs. We needed to merge this PR for other reasons.
|
/merge |
List of changes I made during the migration:
cuvs/neighbors/ivf_pq_helpers.hpp)search_with_filteris added to the public API, with bitset_filter