Add support for PQ preprocessing API #1278

Merged
rapids-bot[bot] merged 90 commits into rapidsai:release/26.02 from lowener:25.10-pq-preprocessing on Jan 23, 2026
Contributor

@lowener lowener commented Aug 23, 2025

Closes #107

This PR adds support for a PQ preprocessing API. It gives access to `train()`, `transform()` and `inverse_transform()` functions that can be used to transform a dataset into PQ codes. It reuses the VPQ functions from CAGRA-Q.
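
For readers less familiar with product quantization, the three steps can be sketched conceptually in pure Python. This is not the cuVS API: the signatures, the `pq_dim` / `pq_bits` parameter names (borrowed from IVF-PQ terminology), the helper functions, and the plain Lloyd k-means used for training are all assumptions made for illustration only.

```python
import random

def _sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def _nearest(centers, v):
    # Index of the closest center; ties go to the lowest index.
    return min(range(len(centers)), key=lambda k: _sqdist(centers[k], v))

def _kmeans(points, n_centers, n_iters, rng):
    # Plain Lloyd iterations, seeded with randomly chosen data points.
    centers = [list(p) for p in rng.sample(points, n_centers)]
    for _ in range(n_iters):
        buckets = [[] for _ in centers]
        for p in points:
            buckets[_nearest(centers, p)].append(p)
        for k, bucket in enumerate(buckets):
            if bucket:  # keep the old center if a cluster went empty
                centers[k] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centers

def train(dataset, pq_dim, pq_bits, n_iters=10, seed=0):
    """Learn one codebook of 2**pq_bits centers per subspace (hypothetical signature)."""
    dim = len(dataset[0])
    assert dim % pq_dim == 0, "this sketch requires dim divisible by pq_dim"
    sub = dim // pq_dim
    rng = random.Random(seed)
    return [
        _kmeans([v[s * sub:(s + 1) * sub] for v in dataset], 2 ** pq_bits, n_iters, rng)
        for s in range(pq_dim)
    ]

def transform(codebooks, dataset):
    """Encode each vector as pq_dim codebook indices (the PQ codes)."""
    sub = len(codebooks[0][0])
    return [
        [_nearest(cb, v[s * sub:(s + 1) * sub]) for s, cb in enumerate(codebooks)]
        for v in dataset
    ]

def inverse_transform(codebooks, codes):
    """Approximate the original vectors by concatenating the selected centers."""
    return [[x for cb, c in zip(codebooks, code) for x in cb[c]] for code in codes]

# Tiny demo: 64 random 4-d vectors, 2 subspaces, 2 bits (4 centers) per subspace.
_rng = random.Random(42)
data = [[_rng.random() for _ in range(4)] for _ in range(64)]
codebooks = train(data, pq_dim=2, pq_bits=2)
codes = transform(codebooks, data)
approx = inverse_transform(codebooks, codes)
```

In this sketch each vector is reduced to `pq_dim` small integers, and `inverse_transform()` only recovers an approximation of the input, since every sub-vector is replaced by its nearest codebook center.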


copy-pr-bot Bot commented Aug 23, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@lowener lowener added the feature request, non-breaking, and C++ labels Aug 25, 2025
Comment thread cpp/include/cuvs/preprocessing/quantize/product.hpp Outdated
Comment thread cpp/include/cuvs/preprocessing/quantize/product.hpp Outdated
Comment thread cpp/include/cuvs/preprocessing/quantize/product.hpp Outdated
Comment thread cpp/src/preprocessing/quantize/detail/product.cuh Outdated
Comment thread cpp/src/preprocessing/quantize/detail/product.cuh Outdated
Comment thread cpp/src/preprocessing/quantize/product.cu Outdated
Comment thread cpp/tests/preprocessing/product_quantization.cu Outdated
Comment thread cpp/include/cuvs/preprocessing/quantize/product.hpp Outdated
@cjnolet cjnolet moved this from Todo to In Progress in Unstructured Data Processing Sep 5, 2025
Comment thread cpp/src/preprocessing/quantize/detail/product.cuh Outdated
Comment thread cpp/include/cuvs/preprocessing/quantize/product.hpp Outdated
@lowener lowener marked this pull request as ready for review September 24, 2025 16:09
@lowener lowener requested review from a team as code owners September 24, 2025 16:09
Member

@KyleFromNVIDIA KyleFromNVIDIA left a comment

Approved trivial CMake changes

aamijar and others added 12 commits January 23, 2026 09:08
Instead of using `cuvs::neighbors::brute_force::build` and `cuvs::neighbors::brute_force::search` we can consolidate to use the `cuvs::neighbors::all_neighbors::build` API.

Authors:
  - Anupam (https://github.com/aamijar)

Approvers:
  - Tarang Jain (https://github.com/tarang-jain)
  - Jinsol Park (https://github.com/jinsolp)
  - Victor Lafargue (https://github.com/viclafargue)

URL: rapidsai#1693
Closes rapidsai#1577

Reduces binary size by deduplicating `calc_chunk_indices_kernel`. This PR reduces instantiations from 62 -> 1 for each template (`BlockDim`=32, 64, ..., 1024)

### Binary Size Changes
CUDA 12.9: 1096.15MB -> 
CUDA 13: 432.98 MB->

Authors:
  - Jinsol Park (https://github.com/jinsolp)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Robert Maynard (https://github.com/robertmaynard)

URL: rapidsai#1657
…rapidsai#1710)

Contributes to rapidsai/build-planning#242

Modifies `ci/build_wheel.sh` to:

- pass `--build-constraint="${PIP_CONSTRAINT}"` unless build isolation is enabled.
- unset `PIP_CONSTRAINT` (set by rapids-init-pip): it doesn't affect builds as of pip 25.3, and it results in an error from `pip wheel` when it is set and `--build-constraint` is also passed.

Authors:
  - Mike McCarty (https://github.com/mmccarty)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: rapidsai#1710
Signed-off-by: Mickael Ide <[email protected]>
Signed-off-by: Mickael Ide <[email protected]>
Signed-off-by: Mickael Ide <[email protected]>
Signed-off-by: Mickael Ide <[email protected]>
…ai#1708)

Based on the in-development cuVS ABI stability docs, we now encode the C API's ABI stability in the SOVERSION.

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#1708
Correct the libcuvs_c.so ABI major to start at 26.02

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#1724
Signed-off-by: Mickael Ide <[email protected]>
@lowener lowener force-pushed the 25.10-pq-preprocessing branch from 70a2564 to 809bacb Compare January 23, 2026 17:11
Contributor

mythrocks commented Jan 23, 2026

I don't see changes to the Java code here, so I'm reviewing these from the ABI perspective.

Will we be addressing this comment about enum names in this PR/release? I ask because I didn't see a PR attached to #1717.

@cjnolet cjnolet removed request for a team and AyodeAwe January 23, 2026 19:08
Contributor Author

lowener commented Jan 23, 2026

@mythrocks A force push triggered a request for re-review on all files.
The comment about ABI changes was addressed: the enums introduced in this PR now have more specific names that are unlikely to cause symbol collisions.
Issue #1717 is about fixing this across the whole codebase, which is not within the scope of this PR.

Member

cjnolet commented Jan 23, 2026

/merge

}

/**
* @brief Applies inverse quantization transform to given dataset
Member

Suggested change
- * @brief Applies inverse quantization transform to given dataset
+ * @brief Applies inverse quantization transform to given set of encoded vectors

Comment thread c/include/cuvs/cluster/kmeans.h Outdated
/**
* @brief Type of k-means algorithm.
*/
typedef enum { KMeans = 0, KMeansBalanced = 1 } cuvsKMeansType;
Member

Future task: we should update cuvsKMeansParams_t to use this and remove the bool hierarchical flag. Can you create an issue for this just so we don't forget it? It doesn't have to be done now, especially since we're striving to maintain ABI compatibility in our C APIs (we need to start deprecating breaking changes).

@rapids-bot rapids-bot Bot merged commit d1b11d5 into rapidsai:release/26.02 Jan 23, 2026
191 of 193 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in Unstructured Data Processing Jan 23, 2026
rapids-bot Bot pushed a commit that referenced this pull request Jan 24, 2026
This PR brings new params to ivf_pq: an option for the user to choose the layout of the IVF lists. The lists can be flat (no interleaving) or interleaved (current default). Flat codes allow building the index in a CPU-compatible format.
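
To make the two layouts concrete, here is a generic Python sketch of the difference between storing codes flat and storing them interleaved within groups of vectors. The `group_size` and `chunk_len` values are hypothetical and chosen only to keep the example small; they are not the constants cuVS uses, and the function names are made up for illustration.

```python
def flat_layout(codes):
    """Flat: each vector's codes are stored back to back."""
    return [c for vec in codes for c in vec]

def interleaved_layout(codes, group_size=4, chunk_len=2):
    """Interleaved: within each group of vectors, store chunk 0 of every
    vector, then chunk 1 of every vector, and so on."""
    n_codes = len(codes[0])
    assert len(codes) % group_size == 0 and n_codes % chunk_len == 0
    out = []
    for g in range(0, len(codes), group_size):
        group = codes[g:g + group_size]
        for c in range(0, n_codes, chunk_len):
            for vec in group:
                out.extend(vec[c:c + chunk_len])
    return out

def deinterleave(buf, n_vectors, n_codes, group_size=4, chunk_len=2):
    """Undo interleaved_layout(), recovering one code list per vector."""
    codes = [[0] * n_codes for _ in range(n_vectors)]
    pos = 0
    for g in range(0, n_vectors, group_size):
        for c in range(0, n_codes, chunk_len):
            for v in range(g, g + group_size):
                codes[v][c:c + chunk_len] = buf[pos:pos + chunk_len]
                pos += chunk_len
    return codes

# Four vectors with four codes each; vector i holds 10*i + j at position j.
example_codes = [[10 * i + j for j in range(4)] for i in range(4)]
flat = flat_layout(example_codes)
interleaved = interleaved_layout(example_codes)
restored = deinterleave(interleaved, n_vectors=4, n_codes=4)
```

In the flat buffer a single vector's codes can be read as one contiguous slice, which is what makes that layout easy to consume outside the GPU kernels; the interleaved buffer needs an unpacking step like `deinterleave()` first.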

[UPDATE as of 12/19/2025]:
After #1278 is merged, we can unify IVF-PQ and PQ API codepaths.

[UPDATE 01/08/2026]:
This PR can be merged before #1278. The flat code-writing can potentially be reverted once #1278 is merged (so we can later use the PQ preprocessing API directly). However, that will come naturally as part of a broader unification of IVF-PQ and PQ codepaths.

[Benchmarks 01/15/2026]:
## IVF-PQ Layout Benchmark Results

**Dataset**: 1,000,000 vectors × 128 dimensions | **pq_dim**: 32

| pq_bits | Code Size | Direct FLAT Build (ms) | INTERLEAVED Build (ms) | Convert INTERLEAVED to FLAT with Codepacker (ms) | Total time for INTERLEAVED build + Conversion to FLAT with Codepacker (unpack) (ms) | Overhead |
|:-------:|:---------:|:---------------:|:----------------------:|:-------------------:|:----------------------:|:--------:|
| 8 | 32 bytes | 372.46 | 385.86 | 985.28 | 1371.13 | 3.68× |
| 6 | 24 bytes | 298.83 | 300.99 | 961.82 | 1262.82 | 4.23× |
| 5 | 20 bytes | 283.25 | 281.95 | 795.43 | 1077.38 | 3.80× |
| 4 | 16 bytes | 270.63 | 271.01 | 489.73 | 760.75 | 2.81× |
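
The Code Size and Overhead columns can be reproduced from the other columns with a few lines of Python. The formulas here are inferred from the numbers in the table rather than stated in the PR: code size is taken as `ceil(pq_dim * pq_bits / 8)` bytes, and overhead as (INTERLEAVED build + conversion) divided by the direct FLAT build time. Recomputed totals can differ from the table by 0.01 ms, presumably because the table inputs are themselves rounded.

```python
import math

PQ_DIM = 32  # from the benchmark header

# (pq_bits, direct_flat_ms, interleaved_ms, convert_ms), copied from the table
rows = [
    (8, 372.46, 385.86, 985.28),
    (6, 298.83, 300.99, 961.82),
    (5, 283.25, 281.95, 795.43),
    (4, 270.63, 271.01, 489.73),
]

def code_size_bytes(pq_dim, pq_bits):
    # Assumed formula: pq_dim codes of pq_bits each, packed into whole bytes.
    return math.ceil(pq_dim * pq_bits / 8)

code_sizes = [code_size_bytes(PQ_DIM, bits) for bits, _, _, _ in rows]

# Overhead of building INTERLEAVED and converting, relative to a direct FLAT build.
overheads = [
    round((interleaved_ms + convert_ms) / flat_ms, 2)
    for _, flat_ms, interleaved_ms, convert_ms in rows
]

for (bits, *_), size, overhead in zip(rows, code_sizes, overheads):
    print(f"pq_bits={bits}: {size} bytes per vector, overhead {overhead:.2f}x")
```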

Authors:
  - Tarang Jain (https://github.com/tarang-jain)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Robert Maynard (https://github.com/robertmaynard)

URL: #1607
@lowener lowener deleted the 25.10-pq-preprocessing branch January 26, 2026 11:00
lowener added a commit to lowener/cuvs that referenced this pull request Jan 26, 2026
Closes rapidsai#107

This PR adds support for a PQ preprocessing API. It gives access to `train()`, `transform()` and `inverse_transform()` functions that can be used to transform a dataset into PQ codes. It reuses the VPQ functions from CAGRA-Q.

Authors:
  - Micka (https://github.com/lowener)
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Anupam (https://github.com/aamijar)
  - Jinsol Park (https://github.com/jinsolp)
  - Jake Awe (https://github.com/AyodeAwe)
  - Mike McCarty (https://github.com/mmccarty)
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Robert Maynard (https://github.com/robertmaynard)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#1278
lowener pushed a commit to lowener/cuvs that referenced this pull request Jan 26, 2026
rapids-bot Bot pushed a commit that referenced this pull request Mar 13, 2026
Follow-up to the PQ PR #1278.
Closes #1575
Closes #1747
This PR removes the need to compile the same PQ code multiple times for CAGRA-Q and SCANN, which removes code duplication and improves build time.
CAGRA-Q can't use the new public API because it uses half as its math type, so a private API function is used instead.

A small test is added to SCANN to make sure that the returned index is not complete garbage, but more testing should be done there. (Created issue #1747 to track this.)

This PR saves ~2-3 MB on libcuvs.so compiled for a single architecture (141 MB -> 138 MB).

Authors:
  - Micka (https://github.com/lowener)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1746
lowener added a commit to lowener/cuvs that referenced this pull request Mar 30, 2026