Add support for PQ preprocessing API #1278
Merged: rapids-bot merged 90 commits into rapidsai:release/26.02 from lowener:25.10-pq-preprocessing on Jan 23, 2026.
Commits (90):
cfe4f92  Initial commit for PQ preprocessing API (lowener)
096daa5  Support `n_lists` and cleanup code (lowener)
244a9cd  Switch to VPQ (lowener)
9537eb1  Fix trainpq and train workflow (lowener)
2883a25  Remove timer (lowener)
78dfd69  Merge branch 'branch-25.10' into 25.10-pq-preprocessing (lowener)
9dd0cfe  Cleanup Code (lowener)
9c543fb  Add double dtype (lowener)
5471d9a  Add C and python API (lowener)
716fa58  Merge branch 'branch-25.10' into 25.10-pq-preprocessing (lowener)
6d6d4ca  Merge branch 'branch-25.10' into 25.10-pq-preprocessing (KyleFromNVIDIA)
746cac4  Make VQ optional (lowener)
1950da4  Add option for classical KMeans (lowener)
75629e5  Add kmeans option to python (lowener)
32c8912  Merge branch 'branch-25.10' into 25.10-pq-preprocessing (lowener)
d774999  Add getter for pq codebooks (lowener)
a55df82  Fix doc (lowener)
5a151f9  Fix reconstruct kernel (lowener)
a0c5071  Merge branch 'branch-25.12' into 25.10-pq-preprocessing (lowener)
7112efe  Add Vector Quantization (lowener)
407b500  Merge branch 'branch-25.12' into 25.10-pq-preprocessing (lowener)
51d9c94  Merge C API with latest changes (lowener)
37d9f7c  Add VQ to C/Python API (lowener)
eea1421  Address reviews on struct/enum declaration (lowener)
a73809c  Fix params order (lowener)
7bd8f17  Update C/Python doc (lowener)
3e8ef62  Merge branch 'branch-25.12' into 25.10-pq-preprocessing (lowener)
f5bf4ea  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
cebc548  Improve docs (lowener)
f6b7829  Update copyright (lowener)
f8fd16e  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
f04545a  Fix compilation (lowener)
487376c  Switch namespace to pq (lowener)
03f8761  Add shared mem to compute_code PQ kernel (lowener)
9ebca8a  Add use_pq parameter (lowener)
a2c833b  Add subspace option for PQ (lowener)
bf68702  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
294db1c  Remove double+simplify train (lowener)
746961d  Optimize train steps, use build cluster for km balanced (lowener)
cf6d482  Simplify subspace build loop (lowener)
6cdfa9d  Split PQ params (lowener)
67a5997  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
8d12d2d  Add extreme cpp test cases, Add support for host dataset (lowener)
ab0fa28  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
d4a46fa  Fix compilation mdspan changes (lowener)
377b908  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
b13f9f0  Add c header to all (lowener)
5f5791a  Add pool allocator in example (lowener)
22e83e6  Update python docstring (lowener)
fe6753e  Fix test when nrows == n_centers for VPQ (lowener)
d05c85c  Fix train conditions (lowener)
cc90c75  Revert "Fix test when nrows == n_centers for VPQ" (lowener)
d8e6c84  Fix doc (lowener)
c850df7  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
dcd8380  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
86465ff  Cooperative load + prefetch (lowener)
9bfe19e  Compute by chunk of 4 (lowener)
4730df6  Fix math_t data_t (lowener)
0f17a0c  Add pq_bits support of [8-16]. Remove it as a template (lowener)
794e5b9  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
bf238fd  Fix comment (lowener)
cb4780d  Fix copyright (lowener)
0f05512  Fix copyright 2 (lowener)
b374931  Fix vamana header (lowener)
c122228  Simplify code and add helper function (lowener)
06ddb89  Remove copy_vectorized, and direct intrinsics calls. Simplify bitfield (lowener)
b9675e8  Add float2 vectorization (lowener)
d97f9b7  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
6b9126e  Default std optional to nullopt (lowener)
faa7659  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
05bf256  Fix reconstruct tpb and logic in shared_memory handling of non-subspa… (lowener)
0d76786  Fix shared mem for very large pq_len (lowener)
e0805da  Merge branch 'main' into 25.10-pq-preprocessing (cjnolet)
261a4b4  Fix misaligned address for vectorized load (lowener)
f0d8061  Modify params struct + deprecate trainset_fraction (lowener)
c56a8c4  Merge branch 'main' into 25.10-pq-preprocessing (lowener)
2ddae5c  Simplify use of optional, change kmeans_type to avoid name conflicts (lowener)
8052db3  Spectral Embedding with `all_neighbors` (#1693) (aamijar)
c927dec  Deduplicate `calc_chunk_indices_kernel` (#1657) (jinsolp)
60ebe3f  Prepare release/26.02 (AyodeAwe)
f198537  wheel builds: react to changes in pip's handling of build constraints… (mmccarty)
4bb9435  Use raft::TxN_t (lowener)
2e13085  Use separate vector for VQ labels, switch Vamana to public PQ API (lowener)
081254c  Add note on doc (lowener)
a69e2cf  Add issue # (lowener)
6e77ab0  Fix doc (lowener)
841a3a9  pre-built libcuvs_c.so now use the new ABI major/minor values (#1708) (robertmaynard)
b95eb46  Correct base release for cuvs abi 1 major (#1724) (robertmaynard)
809bacb  Add new option to ann-bench (lowener)
3116fe3  Merge branch 'release/26.02' into 25.10-pq-preprocessing (lowener)
New file, @@ -0,0 +1,227 @@

/*
 * SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
 * SPDX-License-Identifier: Apache-2.0
 */

#pragma once

#include <cuvs/cluster/kmeans.h>
#include <cuvs/core/c_api.h>
#include <dlpack/dlpack.h>
#include <stdbool.h>
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

/**
 * @defgroup preprocessing_c_pq C API for Product Quantizer
 * @{
 */

/**
 * @brief Product quantizer parameters.
 */
struct cuvsProductQuantizerParams {
  /**
   * The bit length of the vector element after compression by PQ.
   *
   * Possible values: within [4, 16].
   *
   * Hint: the smaller the `pq_bits`, the smaller the index size and the better the search
   * performance, but the lower the recall.
   */
  uint32_t pq_bits;
  /**
   * The dimensionality of the vector after compression by PQ.
   * When zero, an optimal value is selected using a heuristic.
   *
   * TODO: at the moment `dim` must be a multiple of `pq_dim`.
   */
  uint32_t pq_dim;
  /**
   * Whether to use subspaces for product quantization (PQ).
   * When true, one PQ codebook is used for each subspace. Otherwise, a single
   * PQ codebook is used.
   */
  bool use_subspaces;
  /**
   * Whether to use Vector Quantization (KMeans) before product quantization (PQ).
   * When true, VQ is applied before PQ. When false, only product quantization is used.
   */
  bool use_vq;
  /**
   * Vector Quantization (VQ) codebook size - the number of "coarse cluster centers".
   * When zero, an optimal value is selected using a heuristic.
   * When one, only product quantization is used.
   */
  uint32_t vq_n_centers;
  /** The number of iterations searching for kmeans centers (both VQ & PQ phases). */
  uint32_t kmeans_n_iters;
  /**
   * The type of kmeans algorithm to use for PQ training.
   */
  cuvsKMeansType pq_kmeans_type;
  /**
   * The max number of data points to use per PQ code during PQ codebook training. Using more data
   * points per PQ code may increase the quality of the PQ codebook but may also increase the build
   * time. We will use `pq_n_centers * max_train_points_per_pq_code` training
   * points to train each PQ codebook.
   */
  uint32_t max_train_points_per_pq_code;
  /**
   * The max number of data points to use per VQ cluster.
   */
  uint32_t max_train_points_per_vq_cluster;
};

typedef struct cuvsProductQuantizerParams* cuvsProductQuantizerParams_t;

/**
 * @brief Allocate Product Quantizer params and populate with default values.
 *
 * @param[in] params cuvsProductQuantizerParams_t to allocate
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerParamsCreate(cuvsProductQuantizerParams_t* params);

/**
 * @brief De-allocate Product Quantizer params.
 *
 * @param[in] params cuvsProductQuantizerParams_t to de-allocate
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerParamsDestroy(cuvsProductQuantizerParams_t params);

/**
 * @brief Defines and stores the product quantizer upon training.
 *
 * The quantization is performed by mapping each vector (or sub-vector, when
 * subspaces are used) to the index of its nearest center in a trained codebook.
 */
typedef struct {
  uintptr_t addr;
  DLDataType dtype;
} cuvsProductQuantizer;

typedef cuvsProductQuantizer* cuvsProductQuantizer_t;

/**
 * @brief Allocate Product Quantizer.
 *
 * @param[in] quantizer cuvsProductQuantizer_t to allocate
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerCreate(cuvsProductQuantizer_t* quantizer);

/**
 * @brief De-allocate Product Quantizer.
 *
 * @param[in] quantizer cuvsProductQuantizer_t to de-allocate
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerDestroy(cuvsProductQuantizer_t quantizer);

/**
 * @brief Builds a product quantizer to be used later for quantizing the dataset.
 *
 * @param[in] res raft resource
 * @param[in] params parameters for product quantizer training
 * @param[in] dataset a row-major host or device matrix
 * @param[out] quantizer trained product quantizer
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerBuild(cuvsResources_t res,
                                      cuvsProductQuantizerParams_t params,
                                      DLManagedTensor* dataset,
                                      cuvsProductQuantizer_t quantizer);

/**
 * @brief Applies the product quantization transform to the given dataset.
 *
 * @param[in] res raft resource
 * @param[in] quantizer product quantizer
 * @param[in] dataset a row-major host or device matrix to transform
 * @param[out] codes_out a row-major device matrix to store the transformed data
 * @param[out] vq_labels a device vector to store VQ labels.
 *                       Optional, can be NULL.
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerTransform(cuvsResources_t res,
                                          cuvsProductQuantizer_t quantizer,
                                          DLManagedTensor* dataset,
                                          DLManagedTensor* codes_out,
                                          DLManagedTensor* vq_labels);

/**
 * @brief Applies the product quantization inverse transform to the given quantized codes.
 *
 * @param[in] res raft resource
 * @param[in] quantizer product quantizer
 * @param[in] pq_codes a row-major device matrix of quantized codes
 * @param[out] out a row-major device matrix to store the reconstructed data
 * @param[in] vq_labels a device vector containing the VQ labels when VQ is used.
 *                      Optional, can be NULL.
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerInverseTransform(cuvsResources_t res,
                                                 cuvsProductQuantizer_t quantizer,
                                                 DLManagedTensor* pq_codes,
                                                 DLManagedTensor* out,
                                                 DLManagedTensor* vq_labels);

/**
 * @brief Get the bit length of the vector element after compression by PQ.
 *
 * @param[in] quantizer product quantizer
 * @param[out] pq_bits bit length of the vector element after compression by PQ
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerGetPqBits(cuvsProductQuantizer_t quantizer, uint32_t* pq_bits);

/**
 * @brief Get the dimensionality of the vector after compression by PQ.
 *
 * @param[in] quantizer product quantizer
 * @param[out] pq_dim dimensionality of the vector after compression by PQ
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerGetPqDim(cuvsProductQuantizer_t quantizer, uint32_t* pq_dim);

/**
 * @brief Get the PQ codebook.
 *
 * @param[in] quantizer product quantizer
 * @param[out] pq_codebook PQ codebook
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerGetPqCodebook(cuvsProductQuantizer_t quantizer,
                                              DLManagedTensor* pq_codebook);

/**
 * @brief Get the VQ codebook.
 *
 * @param[in] quantizer product quantizer
 * @param[out] vq_codebook VQ codebook
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerGetVqCodebook(cuvsProductQuantizer_t quantizer,
                                              DLManagedTensor* vq_codebook);

/**
 * @brief Get the encoded dimension of the quantized dataset.
 *
 * @param[in] quantizer product quantizer
 * @param[out] encoded_dim encoded dimension of the quantized dataset
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerGetEncodedDim(cuvsProductQuantizer_t quantizer,
                                              uint32_t* encoded_dim);

/**
 * @brief Get whether VQ is used.
 *
 * @param[in] quantizer product quantizer
 * @param[out] use_vq whether VQ is used
 * @return cuvsError_t
 */
cuvsError_t cuvsProductQuantizerGetUseVq(cuvsProductQuantizer_t quantizer, bool* use_vq);

/**
 * @}
 */

#ifdef __cplusplus
}
#endif