Merged

90 commits
cfe4f92
Initial commit for PQ preprocessing API
lowener Aug 22, 2025
096daa5
Support `n_lists` and cleanup code
lowener Aug 25, 2025
244a9cd
Switch to VPQ
lowener Sep 22, 2025
9537eb1
Fix trainpq and train workflow
lowener Sep 23, 2025
2883a25
Remove timer
lowener Sep 23, 2025
78dfd69
Merge branch 'branch-25.10' into 25.10-pq-preprocessing
lowener Sep 23, 2025
9dd0cfe
Cleanup Code
lowener Sep 23, 2025
9c543fb
Add double dtype
lowener Sep 24, 2025
5471d9a
Add C and python API
lowener Sep 26, 2025
716fa58
Merge branch 'branch-25.10' into 25.10-pq-preprocessing
lowener Sep 26, 2025
6d6d4ca
Merge branch 'branch-25.10' into 25.10-pq-preprocessing
KyleFromNVIDIA Sep 26, 2025
746cac4
Make VQ optional
lowener Sep 29, 2025
1950da4
Add option for classical KMeans
lowener Sep 29, 2025
75629e5
Add kmeans option to python
lowener Sep 30, 2025
32c8912
Merge branch 'branch-25.10' into 25.10-pq-preprocessing
lowener Oct 1, 2025
d774999
Add getter for pq codebooks
lowener Sep 30, 2025
a55df82
Fix doc
lowener Oct 1, 2025
5a151f9
Fix reconstruct kernel
lowener Oct 2, 2025
a0c5071
Merge branch 'branch-25.12' into 25.10-pq-preprocessing
lowener Oct 6, 2025
7112efe
Add Vector Quantization
lowener Oct 14, 2025
407b500
Merge branch 'branch-25.12' into 25.10-pq-preprocessing
lowener Oct 14, 2025
51d9c94
Merge C API with latest changes
lowener Oct 14, 2025
37d9f7c
Add VQ to C/Python API
lowener Oct 16, 2025
eea1421
Address reviews on struct/enum declaration
lowener Oct 16, 2025
a73809c
Fix params order
lowener Oct 16, 2025
7bd8f17
Update C/Python doc
lowener Oct 20, 2025
3e8ef62
Merge branch 'branch-25.12' into 25.10-pq-preprocessing
lowener Oct 20, 2025
f5bf4ea
Merge branch 'main' into 25.10-pq-preprocessing
lowener Oct 28, 2025
cebc548
Improve docs
lowener Oct 28, 2025
f6b7829
Update copyright
lowener Oct 28, 2025
f8fd16e
Merge branch 'main' into 25.10-pq-preprocessing
lowener Oct 29, 2025
f04545a
Fix compilation
lowener Nov 3, 2025
487376c
Switch namespace to pq
lowener Nov 6, 2025
03f8761
Add shared mem to compute_code PQ kernel
lowener Nov 6, 2025
9ebca8a
Add use_pq parameter
lowener Nov 10, 2025
a2c833b
Add subspace option for PQ
lowener Nov 18, 2025
bf68702
Merge branch 'main' into 25.10-pq-preprocessing
lowener Nov 20, 2025
294db1c
Remove double+simplify train
lowener Nov 21, 2025
746961d
Optimize train steps, use build cluster for km balanced
lowener Nov 21, 2025
cf6d482
Simplify subspace build loop
lowener Nov 26, 2025
6cdfa9d
Split PQ params
lowener Dec 1, 2025
67a5997
Merge branch 'main' into 25.10-pq-preprocessing
lowener Dec 9, 2025
8d12d2d
Add extreme cpp test cases, Add support for host dataset
lowener Dec 10, 2025
ab0fa28
Merge branch 'main' into 25.10-pq-preprocessing
lowener Dec 12, 2025
d4a46fa
Fix compilation mdspan changes
lowener Dec 16, 2025
377b908
Merge branch 'main' into 25.10-pq-preprocessing
lowener Dec 16, 2025
b13f9f0
Add c header to all
lowener Dec 16, 2025
5f5791a
Add pool allocator in example
lowener Dec 16, 2025
22e83e6
Update python docstring
lowener Dec 16, 2025
fe6753e
Fix test when nrows == n_centers for VPQ
lowener Dec 16, 2025
d05c85c
Fix train conditions
lowener Dec 16, 2025
cc90c75
Revert "Fix test when nrows == n_centers for VPQ"
lowener Dec 16, 2025
d8e6c84
Fix doc
lowener Dec 17, 2025
c850df7
Merge branch 'main' into 25.10-pq-preprocessing
lowener Dec 18, 2025
dcd8380
Merge branch 'main' into 25.10-pq-preprocessing
lowener Dec 23, 2025
86465ff
Cooperative load + prefetch
lowener Dec 24, 2025
9bfe19e
Compute by chunk of 4
lowener Dec 24, 2025
4730df6
Fix math_t data_t
lowener Dec 26, 2025
0f17a0c
Add pq_bits support of [8-16]. Remove it as a template
lowener Jan 5, 2026
794e5b9
Merge branch 'main' into 25.10-pq-preprocessing
lowener Jan 5, 2026
bf238fd
Fix comment
lowener Jan 5, 2026
cb4780d
Fix copyright
lowener Jan 5, 2026
0f05512
Fix copyright 2
lowener Jan 5, 2026
b374931
Fix vamana header
lowener Jan 5, 2026
c122228
Simplify code and add helper function
lowener Jan 8, 2026
06ddb89
Remove copy_vectorized, and direct intrinsics calls. Simplify bitfield
lowener Jan 9, 2026
b9675e8
Add float2 vectorization
lowener Jan 9, 2026
d97f9b7
Merge branch 'main' into 25.10-pq-preprocessing
lowener Jan 9, 2026
6b9126e
Default std optional to nullopt
lowener Jan 12, 2026
faa7659
Merge branch 'main' into 25.10-pq-preprocessing
lowener Jan 12, 2026
05bf256
Fix reconstruct tpb and logic in shared_memory handling of non-subspa…
lowener Jan 12, 2026
0d76786
Fix shared mem for very large pq_len
lowener Jan 12, 2026
e0805da
Merge branch 'main' into 25.10-pq-preprocessing
cjnolet Jan 13, 2026
261a4b4
Fix misaligned address for vectorized load
lowener Jan 14, 2026
f0d8061
Modify params struct + deprecate trainset_fraction
lowener Jan 15, 2026
c56a8c4
Merge branch 'main' into 25.10-pq-preprocessing
lowener Jan 15, 2026
2ddae5c
Simplify use of optional, change kmeans_type to avoid name conflicts
lowener Jan 21, 2026
8052db3
Spectral Embedding with `all_neighbors` (#1693)
aamijar Jan 15, 2026
c927dec
Deduplicate `calc_chunk_indices_kernel` (#1657)
jinsolp Jan 16, 2026
60ebe3f
Prepare release/26.02
AyodeAwe Jan 16, 2026
f198537
wheel builds: react to changes in pip's handling of build constraints…
mmccarty Jan 16, 2026
4bb9435
Use raft::TxN_t
lowener Jan 21, 2026
2e13085
Use separate vector for VQ labels, switch Vamana to public PQ API
lowener Jan 22, 2026
081254c
Add note on doc
lowener Jan 22, 2026
a69e2cf
Add issue #
lowener Jan 22, 2026
6e77ab0
Fix doc
lowener Jan 22, 2026
841a3a9
pre-built libcuvs_c.so now use the new ABI major/minor values (#1708)
robertmaynard Jan 22, 2026
b95eb46
Correct base release for cuvs abi 1 major (#1724)
robertmaynard Jan 23, 2026
809bacb
Add new option to ann-bench
lowener Jan 23, 2026
3116fe3
Merge branch 'release/26.02' into 25.10-pq-preprocessing
lowener Jan 23, 2026
c/CMakeLists.txt (2 additions, 1 deletion)
@@ -1,6 +1,6 @@
# =============================================================================
# cmake-format: off
# SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0
# cmake-format: on
# =============================================================================
@@ -100,6 +100,7 @@ add_library(
src/neighbors/tiered_index.cpp
src/neighbors/all_neighbors.cpp
src/preprocessing/quantize/binary.cpp
src/preprocessing/quantize/pq.cpp
src/preprocessing/quantize/scalar.cpp
src/distance/pairwise_distance.cpp
)
c/include/cuvs/cluster/kmeans.h (6 additions, 1 deletion)
@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -121,6 +121,11 @@ cuvsError_t cuvsKMeansParamsCreate(cuvsKMeansParams_t* params);
*/
cuvsError_t cuvsKMeansParamsDestroy(cuvsKMeansParams_t params);

/**
* @brief Type of k-means algorithm.
*/
typedef enum { KMeans = 0, KMeansBalanced = 1 } cuvsKMeansType;
Member:
Future task: we should update cuvsKMeansParams_t to use this and remove the bool hierarchical flag. Can you create an issue for this just so we don't forget it? It doesn't have to be done now, especially since we're striving to maintain ABI compatibility in our C APIs (we need to start deprecating breaking changes).

Contributor:
This introduces two new symbols to the global namespace, KMeans and KMeansBalanced, and could cause symbol collisions for people including this header.

Perhaps we should rename them to something like cuvsKMeansTypeKMeans and cuvsKMeansTypeKMeansBalanced?

(I realize that we haven't been doing this for most of our enums so far, but I think this is something that we should change.)

Contributor:
Also, we are using a boolean parameter for this distinction in the kMeansParams object:

/**
 * Whether to use hierarchical (balanced) kmeans or not
 */
bool hierarchical;

and I feel like we should either use an enum for both or a bool for both, to be consistent.

Contributor:
I agree we should change these now, before they get baked into the ABI and we have to wait ~6 months.

Contributor Author:
Done, I created #1717 to track that problem.

Contributor Author:
For the bool vs. enum question, I much prefer the enum, since it would make adding a third KMeans algorithm much easier later. But I don't want to modify the kmeans codebase in this PR since its scope is already big enough.
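The rename suggested in this thread would look roughly like the sketch below. The names are hypothetical (the actual spelling shipped in cuvs may differ); the point is that prefixing each enumerator with the type name keeps bare identifiers like KMeans out of the global C namespace while preserving the integer values.

```c
#include <assert.h>

/* Hypothetical prefixed spelling of the enum discussed above. */
typedef enum {
  cuvsKMeansTypeKMeans         = 0, /* classical k-means */
  cuvsKMeansTypeKMeansBalanced = 1  /* hierarchical/balanced k-means */
} cuvsKMeansTypePrefixed;

/* The enumerators keep the same integer values as the unprefixed ones,
 * so ABI-level behavior is unchanged; only the C symbol names move. */
static int is_balanced(cuvsKMeansTypePrefixed t)
{
  return t == cuvsKMeansTypeKMeansBalanced;
}
```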


/**
* @}
*/
c/include/cuvs/core/all.h (2 additions, 1 deletion)
@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -39,4 +39,5 @@
#endif

#include <cuvs/preprocessing/quantize/binary.h>
#include <cuvs/preprocessing/quantize/pq.h>
#include <cuvs/preprocessing/quantize/scalar.h>
c/include/cuvs/preprocessing/quantize/binary.h (9 additions, 1 deletion)
@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -13,6 +13,10 @@
extern "C" {
#endif

/**
* @defgroup preprocessing_c_binary C API for Binary Quantizer
* @{
*/
/**
* @brief In the cuvsBinaryQuantizerTransform function, a bit is set if the corresponding element in
* the dataset vector is greater than the corresponding element in the threshold vector. The mean
@@ -132,6 +136,10 @@ cuvsError_t cuvsBinaryQuantizerTransformWithParams(cuvsResources_t res,
DLManagedTensor* dataset,
DLManagedTensor* out);

/**
* @}
*/

#ifdef __cplusplus
}
#endif
c/include/cuvs/preprocessing/quantize/pq.h (new file, 220 additions)
@@ -0,0 +1,220 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

#pragma once

#include <cuvs/cluster/kmeans.h>
#include <cuvs/core/c_api.h>
#include <dlpack/dlpack.h>
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

/**
* @defgroup preprocessing_c_pq C API for Product Quantizer
* @{
*/
/**
* @brief Product quantizer parameters.
*/
struct cuvsProductQuantizerParams {
/**
* The bit length of the vector element after compression by PQ.
*
* Possible values: within [4, 16].
*
* Hint: the smaller the 'pq_bits', the smaller the index size and the better the search
* performance, but the lower the recall.
*/
uint32_t pq_bits;
/**
* The dimensionality of the vector after compression by PQ.
* When zero, an optimal value is selected using a heuristic.
*
 * TODO: at the moment `dim` must be a multiple of `pq_dim`.
*/
uint32_t pq_dim;
/**
* Vector Quantization (VQ) codebook size - number of "coarse cluster centers".
* When zero, an optimal value is selected using a heuristic.
* When one, only product quantization is used.
*/
uint32_t vq_n_centers;
/** The number of iterations searching for kmeans centers (both VQ & PQ phases). */
uint32_t kmeans_n_iters;
/**
* The fraction of data to use during iterative kmeans building (VQ phase).
* When zero, an optimal value is selected using a heuristic.
*/
double vq_kmeans_trainset_fraction;
/**
* The fraction of data to use during iterative kmeans building (PQ phase).
* When zero, an optimal value is selected using a heuristic.
*/
double pq_kmeans_trainset_fraction;
/**
* The type of kmeans algorithm to use for PQ training.
*/
cuvsKMeansType pq_kmeans_type;
/**
* The max number of data points to use per PQ code during PQ codebook training. Using more data
* points per PQ code may increase the quality of PQ codebook but may also increase the build
* time. We will use `pq_n_centers * max_train_points_per_pq_code` training
* points to train each PQ codebook.
*/
uint32_t max_train_points_per_pq_code;
/**
* Whether to use Vector Quantization (KMeans) before product quantization (PQ).
* When true, VQ is used before PQ. When false, only product quantization is used.
*/
bool use_vq;
/**
* Whether to use subspaces for product quantization (PQ).
* When true, one PQ codebook is used for each subspace. Otherwise, a single
* PQ codebook is used.
*/
bool use_subspaces;
};
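As a sketch of how `pq_dim` and `pq_bits` interact, the snippet below shows one plausible way the per-row size of a bit-packed code could be derived: `pq_dim` sub-vector indices of `pq_bits` bits each, rounded up to whole bytes. This packing is an assumption for illustration only; the authoritative value for a trained quantizer comes from `cuvsProductQuantizerGetEncodedDim` declared later in this header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes per row when pq_dim indices of pq_bits bits
 * each are packed contiguously and padded up to a whole byte. */
static uint32_t packed_row_bytes(uint32_t pq_dim, uint32_t pq_bits)
{
  return (pq_dim * pq_bits + 7) / 8; /* round up to whole bytes */
}
```

This also illustrates the `pq_bits` trade-off documented above: at `pq_bits = 8` a 32-sub-vector code takes 32 bytes, while `pq_bits = 5` shrinks it to 20 bytes at the cost of recall.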

typedef struct cuvsProductQuantizerParams* cuvsProductQuantizerParams_t;

/**
* @brief Allocate Product Quantizer params, and populate with default values
*
* @param[in] params cuvsProductQuantizerParams_t to allocate
* @return cuvsError_t
*/
cuvsError_t cuvsProductQuantizerParamsCreate(cuvsProductQuantizerParams_t* params);

/**
* @brief De-allocate Product Quantizer params
*
* @param[in] params
* @return cuvsError_t
*/
cuvsError_t cuvsProductQuantizerParamsDestroy(cuvsProductQuantizerParams_t params);

/**
* @brief Defines and stores product quantizer upon training
*
 * The quantizer holds the trained VQ and PQ codebooks used to encode
 * dataset vectors into compact PQ codes.
*/
typedef struct {
uintptr_t addr;
DLDataType dtype;
} cuvsProductQuantizer;

typedef cuvsProductQuantizer* cuvsProductQuantizer_t;

/**
* @brief Allocate Product Quantizer
*
* @param[in] quantizer cuvsProductQuantizer_t to allocate
* @return cuvsError_t
*/
cuvsError_t cuvsProductQuantizerCreate(cuvsProductQuantizer_t* quantizer);

/**
* @brief De-allocate Product Quantizer
*
* @param[in] quantizer
* @return cuvsError_t
*/
cuvsError_t cuvsProductQuantizerDestroy(cuvsProductQuantizer_t quantizer);

/**
* @brief Trains a product quantizer to be used later for quantizing the dataset.
*
* @param[in] res raft resource
* @param[in] params Parameters for product quantizer training
* @param[in] dataset a row-major host or device matrix
* @param[out] quantizer trained product quantizer
*/
cuvsError_t cuvsProductQuantizerTrain(cuvsResources_t res,
cuvsProductQuantizerParams_t params,
DLManagedTensor* dataset,
cuvsProductQuantizer_t quantizer);

/**
* @brief Applies product quantization transform to the given dataset
*
* This applies product quantization to a dataset.
*
* @param[in] res raft resource
* @param[in] quantizer product quantizer
* @param[in] dataset a row-major host or device matrix to transform
* @param[out] out a row-major device matrix to store transformed data
*/
cuvsError_t cuvsProductQuantizerTransform(cuvsResources_t res,
cuvsProductQuantizer_t quantizer,
DLManagedTensor* dataset,
DLManagedTensor* out);

/**
* @brief Applies product quantization inverse transform to the given quantized codes
*
* This applies product quantization inverse transform to the given quantized codes.
*
* @param[in] res raft resource
* @param[in] quantizer product quantizer
* @param[in] codes a row-major device matrix of quantized codes
* @param[out] out a row-major device matrix to store the original data
*/
cuvsError_t cuvsProductQuantizerInverseTransform(cuvsResources_t res,
cuvsProductQuantizer_t quantizer,
DLManagedTensor* codes,
DLManagedTensor* out);

/**
* @brief Get the bit length of the vector element after compression by PQ.
*
* @param[in] quantizer product quantizer
* @param[out] pq_bits bit length of the vector element after compression by PQ
*/
cuvsError_t cuvsProductQuantizerGetPqBits(cuvsProductQuantizer_t quantizer, uint32_t* pq_bits);

/**
* @brief Get the dimensionality of the vector after compression by PQ.
*
* @param[in] quantizer product quantizer
* @param[out] pq_dim dimensionality of the vector after compression by PQ
*/
cuvsError_t cuvsProductQuantizerGetPqDim(cuvsProductQuantizer_t quantizer, uint32_t* pq_dim);

/**
* @brief Get the PQ codebook.
*
* @param[in] quantizer product quantizer
* @param[out] pq_codebook PQ codebook
*/
cuvsError_t cuvsProductQuantizerGetPqCodebook(cuvsProductQuantizer_t quantizer,
DLManagedTensor* pq_codebook);

/**
* @brief Get the VQ codebook.
*
* @param[in] quantizer product quantizer
* @param[out] vq_codebook VQ codebook
*/
cuvsError_t cuvsProductQuantizerGetVqCodebook(cuvsProductQuantizer_t quantizer,
DLManagedTensor* vq_codebook);
/**
* @brief Get the encoded dimension of the quantized dataset.
*
* @param[in] quantizer product quantizer
* @param[out] encoded_dim encoded dimension of the quantized dataset
*/
cuvsError_t cuvsProductQuantizerGetEncodedDim(cuvsProductQuantizer_t quantizer,
uint32_t* encoded_dim);

/**
* @}
*/
#ifdef __cplusplus
}
#endif
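Putting the declarations above together, a train/transform/inverse-transform round trip could be sketched as below. This is a sketch, not a complete program: error-code checking is elided, the DLManagedTensor setup for the three matrices is assumed to be done by the caller, and `pq_round_trip` is a hypothetical helper name. `cuvsResourcesCreate`/`cuvsResourcesDestroy` come from `cuvs/core/c_api.h`; everything else is declared in this header.

```c
#include <stdbool.h>

#include <cuvs/core/c_api.h>
#include <cuvs/preprocessing/quantize/pq.h>

void pq_round_trip(DLManagedTensor* dataset,  /* row-major float matrix */
                   DLManagedTensor* codes,    /* output: PQ codes */
                   DLManagedTensor* decoded)  /* output: reconstruction */
{
  cuvsResources_t res;
  cuvsResourcesCreate(&res);

  /* Defaults let the heuristics pick pq_dim, vq_n_centers, etc. */
  cuvsProductQuantizerParams_t params;
  cuvsProductQuantizerParamsCreate(&params);
  params->pq_bits = 8;
  params->use_vq  = true; /* coarse VQ stage before PQ */

  cuvsProductQuantizer_t quantizer;
  cuvsProductQuantizerCreate(&quantizer);

  cuvsProductQuantizerTrain(res, params, dataset, quantizer);
  cuvsProductQuantizerTransform(res, quantizer, dataset, codes);
  cuvsProductQuantizerInverseTransform(res, quantizer, codes, decoded);

  cuvsProductQuantizerDestroy(quantizer);
  cuvsProductQuantizerParamsDestroy(params);
  cuvsResourcesDestroy(res);
}
```

The size of the `codes` buffer should be sized from `cuvsProductQuantizerGetEncodedDim` after training rather than computed by hand.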
c/include/cuvs/preprocessing/quantize/scalar.h (9 additions, 1 deletion)
@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -13,6 +13,10 @@
extern "C" {
#endif

/**
* @defgroup preprocessing_c_scalar C API for Scalar Quantizer
* @{
*/
/**
* @brief Scalar quantizer parameters.
*/
@@ -114,6 +118,10 @@ cuvsError_t cuvsScalarQuantizerInverseTransform(cuvsResources_t res,
DLManagedTensor* dataset,
DLManagedTensor* out);

/**
* @}
*/

#ifdef __cplusplus
}
#endif