cuda.compute: Add select algorithm based on three_way_partition #6766
Merged
shwina merged 4 commits into NVIDIA:main on Nov 25, 2025
NaderAlAwar (Contributor) approved these changes on Nov 24, 2025 and left a comment:
Approved with the expectation that we will replace this when we port select_if to cuda.compute
e7f8680 to 643a341
🥳 CI Workflow Results: 🟩 Finished in 1h 27m | Pass: 100%/48 | Total: 15h 01m | Max: 51m 06s
Description
This PR adds an implementation of select that is based on three_way_partition. Longer term, we want to bind directly to cub::DeviceSelect.

It works by simply discarding the unselected portions of three_way_partition. As part of this PR, I needed to add a fix to DiscardIterator, which always defaulted to using a value type of uint8. This doesn't always work, because the value type must match the output type that the algorithm expects. For struct types in particular, no implicit conversion is usually possible, so the value type must be provided explicitly.
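For readers who want the idea in miniature, here is a CPU-only sketch in plain Python. It is not the cuda.compute API and not the code in this PR: `DiscardSink`, `three_way_partition`, and `select` below are hypothetical stand-ins written for illustration. It assumes that select can be expressed by passing a second predicate that never matches and routing the second and unselected outputs to sinks that drop everything. The sink takes an explicit value type to mirror the DiscardIterator fix described above.

```python
from typing import Callable, List, Sequence, Tuple


class DiscardSink:
    """Hypothetical write-only sink: accepts items of one declared type and drops them."""

    def __init__(self, value_type: type) -> None:
        self.value_type = value_type

    def append(self, item: object) -> None:
        # A typed algorithm cannot convert its output to an unrelated type,
        # so a mismatched sink is reported instead of silently accepted.
        if not isinstance(item, self.value_type):
            raise TypeError(
                f"sink expects {self.value_type.__name__}, got {type(item).__name__}"
            )
        # The item is intentionally not stored.


def three_way_partition(
    items: Sequence,
    first_out,
    second_out,
    unselected_out,
    first_pred: Callable[[object], bool],
    second_pred: Callable[[object], bool],
) -> Tuple[int, int]:
    """CPU model: route each item to exactly one of three outputs."""
    n_first = n_second = 0
    for item in items:
        if first_pred(item):
            first_out.append(item)
            n_first += 1
        elif second_pred(item):
            second_out.append(item)
            n_second += 1
        else:
            unselected_out.append(item)
    return n_first, n_second


def select(items: Sequence, out: List, pred: Callable[[object], bool], value_type: type) -> int:
    """Keep items matching pred; discard the other two partitions."""
    n_selected, _ = three_way_partition(
        items,
        out,
        DiscardSink(value_type),  # second partition: always empty here
        DiscardSink(value_type),  # unselected items: dropped
        pred,
        lambda _item: False,      # second predicate never matches
    )
    return n_selected


selected: List[int] = []
count = select([1, 2, 3, 4, 5, 6], selected, lambda x: x % 2 == 0, int)
print(count, selected)  # prints: 3 [2, 4, 6]
```

The explicit `value_type` argument is the point of the sketch: a sink declared for one element type rejects writes of another, which is the situation the fixed uint8 default could not handle for struct outputs.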