Add support for PQ preprocessing API #1278
rapids-bot[bot] merged 90 commits into rapidsai:release/26.02
Signed-off-by: Mickael Ide <[email protected]>
KyleFromNVIDIA left a comment:
Approved trivial CMake changes
Instead of using `cuvs::neighbors::brute_force::build` and `cuvs::neighbors::brute_force::search` we can consolidate to use the `cuvs::neighbors::all_neighbors::build` API.

Authors:
- Anupam (https://github.com/aamijar)

Approvers:
- Tarang Jain (https://github.com/tarang-jain)
- Jinsol Park (https://github.com/jinsolp)
- Victor Lafargue (https://github.com/viclafargue)

URL: rapidsai#1693
Closes rapidsai#1577

Reduces binary size by deduplicating `calc_chunk_indices_kernel`. This PR reduces instantiations from 62 -> 1 for each template (`BlockDim`=32, 64, ..., 1024).

### Binary Size Changes

CUDA 12.9: 1096.15 MB ->
CUDA 13: 432.98 MB ->

Authors:
- Jinsol Park (https://github.com/jinsolp)

Approvers:
- Divye Gala (https://github.com/divyegala)
- Robert Maynard (https://github.com/robertmaynard)

URL: rapidsai#1657
…rapidsai#1710)

Contributes to rapidsai/build-planning#242

Modifying `ci/build_wheel.sh` to
- pass `--build-constraint="${PIP_CONSTRAINT}"` unless build isolation is enabled.
- unset `PIP_CONSTRAINT` (set by rapids-init-pip)... it doesn't affect builds as of pip 25.3, and results in an error from `pip wheel` when set and `--build-constraint` is also passed

Authors:
- Mike McCarty (https://github.com/mmccarty)

Approvers:
- James Lamb (https://github.com/jameslamb)

URL: rapidsai#1710
…ai#1708)

Based on the in-development cuVS ABI stability docs, we now encode the C API's ABI stability in the SOVERSION.

Authors:
- Robert Maynard (https://github.com/robertmaynard)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#1708

Correct the libcuvs_c.so ABI major to start at 26.02

Authors:
- Robert Maynard (https://github.com/robertmaynard)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#1724
Force-pushed from 70a2564 to 809bacb.
I don't see changes to the Java code here, so I'm reviewing these from the ABI perspective. Will we be addressing this comment about enum names, in this PR/release? I ask because I didn't see a PR attached to #1717.

@mythrocks A force push triggered a request for re-review on all files.

/merge
}

/**
 * @brief Applies inverse quantization transform to given dataset

Suggested change:
- * @brief Applies inverse quantization transform to given dataset
+ * @brief Applies inverse quantization transform to given set of encoded vectors
/**
 * @brief Type of k-means algorithm.
 */
typedef enum { KMeans = 0, KMeansBalanced = 1 } cuvsKMeansType;
Future task: we should update cuvsKMeansParams_t to use this enum and remove the bool hierarchical flag. Can you create an issue for this so we don't forget it? It doesn't have to be done now, especially since we're striving to maintain ABI compatibility in our C APIs (we need to start deprecating breaking changes).
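The migration suggested in this comment could look roughly like the sketch below. It mirrors `cuvsKMeansType` as a Python `IntEnum` purely for illustration. The helper `kmeans_type_from_hierarchical` is hypothetical, and the mapping of `hierarchical = true` to `KMeansBalanced` is an assumption that the thread does not confirm.

```python
from enum import IntEnum


class KMeansType(IntEnum):
    """Python mirror of the C enum cuvsKMeansType shown in the diff."""
    KMeans = 0
    KMeansBalanced = 1


def kmeans_type_from_hierarchical(hierarchical: bool) -> KMeansType:
    # Hypothetical helper: assumes the existing bool flag selects the
    # balanced variant. This mapping is an assumption, not taken from the PR.
    return KMeansType.KMeansBalanced if hierarchical else KMeansType.KMeans
```

Keeping the integer values identical to the C enum means a value could cross the C boundary as a plain `int` while the bool flag goes through a deprecation period.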
Merged commit d1b11d5 into rapidsai:release/26.02
This PR brings new params to ivf_pq: an option for the user to choose the layout of the IVF lists. The lists can be flat (no interleaving) or interleaved (current default). Flat codes allow building the index in a CPU-compatible format.

[UPDATE as of 12/19/2025]: After #1278 is merged, we can unify IVF-PQ and PQ API codepaths.

[UPDATE 01/08/2026]: This PR can be merged before #1278. The flat code-writing can potentially be reverted once #1278 is merged (so we can later use the PQ preprocessing API directly). However, that will come naturally as part of a broader unification of IVF-PQ and PQ codepaths.

[Benchmarks 01/15/2026]:

## IVF-PQ Layout Benchmark Results

**Dataset**: 1,000,000 vectors × 128 dimensions | **pq_dim**: 32

| pq_bits | Code Size | Direct FLAT Build (ms) | INTERLEAVED Build (ms) | Convert INTERLEAVED to FLAT with Codepacker (ms) | Total time for INTERLEAVED build + Conversion to FLAT with Codepacker (unpack) (ms) | Overhead |
|:-------:|:---------:|:---------------:|:----------------------:|:-------------------:|:----------------------:|:--------:|
| 8 | 32 bytes | 372.46 | 385.86 | 985.28 | 1371.13 | 3.68× |
| 6 | 24 bytes | 298.83 | 300.99 | 961.82 | 1262.82 | 4.23× |
| 5 | 20 bytes | 283.25 | 281.95 | 795.43 | 1077.38 | 3.80× |
| 4 | 16 bytes | 270.63 | 271.01 | 489.73 | 760.75 | 2.81× |

Authors:
- Tarang Jain (https://github.com/tarang-jain)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
- Robert Maynard (https://github.com/robertmaynard)

URL: #1607
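The Code Size and Overhead columns in the benchmark above can be reproduced with a few lines of arithmetic. The sketch below assumes that codes are tightly bit-packed (`pq_dim * pq_bits` bits per vector) and that Overhead is the total INTERLEAVED-plus-conversion time divided by the direct FLAT build time. The PR text does not define either quantity, but both assumptions match the reported numbers. The function names are illustrative only.

```python
def pq_code_size_bytes(pq_dim: int, pq_bits: int) -> int:
    """Bytes per encoded vector, assuming tightly bit-packed codes."""
    return (pq_dim * pq_bits + 7) // 8  # round up to whole bytes


def overhead(direct_flat_ms: float, interleaved_ms: float, convert_ms: float) -> float:
    """Ratio of (interleaved build + conversion) to a direct flat build."""
    return (interleaved_ms + convert_ms) / direct_flat_ms


# Timings copied from the benchmark table, keyed by pq_bits:
# (direct FLAT build, INTERLEAVED build, conversion to FLAT), all in ms.
timings_ms = {
    8: (372.46, 385.86, 985.28),
    6: (298.83, 300.99, 961.82),
    5: (283.25, 281.95, 795.43),
    4: (270.63, 271.01, 489.73),
}

for pq_bits, (flat, interleaved, convert) in timings_ms.items():
    size = pq_code_size_bytes(32, pq_bits)
    ratio = overhead(flat, interleaved, convert)
    print(f"pq_bits={pq_bits}: {size} bytes, overhead {ratio:.2f}x")
```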
Closes rapidsai#107

This PR adds support for a PQ preprocessing API. It gives access to `train()`, `transform()` and `inverse_transform()` functions that can be used to transform a dataset into PQ codes. It reuses the VPQ functions from CAGRA-Q.

Authors:
- Micka (https://github.com/lowener)
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
- Corey J. Nolet (https://github.com/cjnolet)
- Anupam (https://github.com/aamijar)
- Jinsol Park (https://github.com/jinsolp)
- Jake Awe (https://github.com/AyodeAwe)
- Mike McCarty (https://github.com/mmccarty)
- Robert Maynard (https://github.com/robertmaynard)

Approvers:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
- Robert Maynard (https://github.com/robertmaynard)
- Tamas Bela Feher (https://github.com/tfeher)
- Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#1278
Follow-up to the PQ PR #1278.

Closes #1575
Closes #1747

This PR removes the need to compile the same PQ code multiple times for CAGRA-Q and SCANN, removing code duplication and improving build time. CAGRA-Q can't use the new public API because it uses half as its math type, so a private API function is used instead. A small test is added to SCANN to make sure that the returned index is not complete garbage, but more testing should be done there. (Created issue #1747 to track this.)

This PR saves ~2-3 MB on libcuvs.so compiled for a single architecture (141 MB -> 138 MB).

Authors:
- Micka (https://github.com/lowener)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1746
Closes #107

This PR adds support for a PQ preprocessing API. It gives access to `train()`, `transform()` and `inverse_transform()` functions that can be used to transform a dataset into PQ codes. It reuses the VPQ functions from CAGRA-Q.
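To make the three operations concrete, here is a minimal CPU sketch of product quantization in pure Python. It is not the cuVS API: only the names `train`, `transform` and `inverse_transform` come from the PR description, and the signatures, the parameters `pq_dim` and `pq_bits` as used here, and the plain Lloyd k-means per subspace are all assumptions made for illustration. It also does plain PQ only, without the vector-quantization stage that VPQ implies.

```python
import random


def _sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _kmeans(points, k, n_iters, rng):
    """Plain Lloyd k-means; centers are initialized from sampled points."""
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _sqdist(p, centers[c]))
            buckets[nearest].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centers


def train(dataset, pq_dim, pq_bits, n_iters=10, seed=0):
    """Learn one codebook per subspace (hypothetical signature)."""
    dim = len(dataset[0])
    assert dim % pq_dim == 0, "dim must be divisible by pq_dim"
    sub = dim // pq_dim
    k = min(2 ** pq_bits, len(dataset))
    rng = random.Random(seed)
    codebooks = []
    for m in range(pq_dim):
        chunk = [row[m * sub:(m + 1) * sub] for row in dataset]
        codebooks.append(_kmeans(chunk, k, n_iters, rng))
    return codebooks


def transform(codebooks, dataset):
    """Encode each vector as pq_dim codebook indices."""
    sub = len(codebooks[0][0])
    codes = []
    for row in dataset:
        code = []
        for m, codebook in enumerate(codebooks):
            part = row[m * sub:(m + 1) * sub]
            code.append(min(range(len(codebook)),
                            key=lambda c: _sqdist(part, codebook[c])))
        codes.append(code)
    return codes


def inverse_transform(codebooks, codes):
    """Decode by concatenating the selected center of each subspace."""
    return [[x for m, c in enumerate(code) for x in codebooks[m][c]]
            for code in codes]
```

A typical round trip is `codes = transform(train(data, pq_dim, pq_bits), data)` followed by `inverse_transform(...)`, which returns an approximation of `data` whose accuracy depends on `pq_dim` and `pq_bits`.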