IVF-PQ coarse search: fix integer overflow and avoid excessive batch sizes #999
Merged
rapids-bot[bot] merged 12 commits into rapidsai:branch-25.08 from Jul 16, 2025
tfeher (Contributor) requested changes on Jul 14, 2025:
Thanks Artem for the fix, I have left a few suggestions below.
tfeher (Contributor) approved these changes on Jul 16, 2025:
Thanks Artem for the update, LGTM!
Contributor commented: /merge
When the IVF-PQ batch size and the number of clusters are large enough, the size of the GEMM output matrix may exceed the `uint32_t` limit. This manifests as an illegal memory access that surfaces in later stages of IVF-PQ (because everything runs as asynchronously as possible). This PR fixes the bug by converting the input arguments to 64 bits before multiplying them.
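
To illustrate the kind of overflow described above, here is a minimal sketch rather than the actual cuVS code. The function names `gemm_output_size_32` and `gemm_output_size_64` and the parameter names are hypothetical; the sketch assumes both dimensions arrive as `uint32_t` and that `int` is 32 bits wide, so the first product is evaluated in 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not the actual cuVS API.

// Overflow-prone variant: both operands are 32-bit, so the product is
// computed modulo 2^32 and only the wrapped result is widened to 64 bits.
inline std::uint64_t gemm_output_size_32(std::uint32_t n_queries, std::uint32_t n_clusters)
{
  return n_queries * n_clusters;
}

// Fixed variant: widen the operands to 64 bits before multiplying,
// so the full product is preserved.
inline std::uint64_t gemm_output_size_64(std::uint32_t n_queries, std::uint32_t n_clusters)
{
  return static_cast<std::uint64_t>(n_queries) * static_cast<std::uint64_t>(n_clusters);
}
```

With, say, 65536 queries and 65536 clusters, the 32-bit product wraps to 0 while the widened product is 4294967296, which is the kind of mismatch that could lead to an undersized allocation and a later out-of-bounds access.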
On top of the overflow fix, this PR also limits the batch size where a larger batch may hurt performance: (1) within IVF-PQ, to avoid out-of-memory errors; (2) in the CAGRA graph build, to avoid switching to the large memory resource (which is typically backed by managed memory).
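
One way such a batch-size limit could look is sketched below. This is an assumption-laden illustration, not the heuristic used in the PR: the name `cap_batch_size`, the byte budget parameter, and the choice of `float` as the default element size are all hypothetical. The idea is simply to shrink the batch until the coarse-search output (one row of `n_clusters` distances per query) fits into a given workspace budget.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration only; not the actual cuVS heuristic.
// Returns the largest batch size, not exceeding the requested one, whose
// (batch x n_clusters) output matrix fits into budget_bytes.
inline std::uint32_t cap_batch_size(std::uint32_t requested_batch,
                                    std::uint32_t n_clusters,
                                    std::size_t budget_bytes,
                                    std::size_t elem_bytes = sizeof(float))
{
  assert(n_clusters > 0 && elem_bytes > 0);
  // Do the size arithmetic in 64 bits to avoid the overflow discussed above.
  const std::uint64_t bytes_per_query = static_cast<std::uint64_t>(n_clusters) * elem_bytes;
  const std::uint64_t max_fit         = budget_bytes / bytes_per_query;
  // Never return less than one query per batch, even if one row exceeds the budget.
  const std::uint64_t capped =
    std::max<std::uint64_t>(1, std::min<std::uint64_t>(requested_batch, max_fit));
  return static_cast<std::uint32_t>(capped);
}
```

For example, with 50000 clusters, `float` distances, and a 1 GiB budget, a requested batch of 100000 queries would be reduced to 5368, while a requested batch of 1000 would be left unchanged.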