Remove JIT+LTO fragment database #1927

Merged
rapids-bot[bot] merged 4 commits into rapidsai:main from KyleFromNVIDIA:jit-lto-remove-fragment-database on Mar 17, 2026

@KyleFromNVIDIA
Member

Rather than register each fragment in a runtime class with a string key, "register" them with the linker using template specialization. This solves a number of problems:

  1. It simplifies the code by removing the `FragmentDatabase` class.
  2. It addresses a review comment on #1909 (Use C linkage for JIT LTO kernels) by bypassing the issue entirely: there is no longer a need to build the fragment name string at runtime.
  3. For clients that use the `cuvs_static` static library, it lets the linker pick only the fragment symbols it needs, rather than including all of them in every client in case any are needed.
  4. Since `$<WHOLE_ARCHIVE:...>` linkage is no longer needed, the `cuvs_jit_lto_kernels` target is not needed at all, which simplifies the CMake code too.
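
The registration-by-specialization idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the cuVS implementation: `Fragment`, `FragmentView`, the tag types, `fragment_size`, and the placeholder byte arrays are all hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical view over an embedded fragment blob (not the cuVS type).
struct FragmentView {
  const unsigned char* data;
  std::size_t size;
};

// Hypothetical compile-time tags that select a kernel variant.
struct EuclideanTag {};
struct InnerProductTag {};

// Primary template: get() is declared but has no generic definition.
// Each fragment provides an explicit specialization, so the "key" is a
// type resolved by the linker instead of a string looked up at runtime.
template <typename MetricTag, typename DataT>
struct Fragment {
  static FragmentView get();
};

// Placeholder bytes standing in for embedded fragment data. In a real
// build, each specialization would likely live in its own source file.
static constexpr unsigned char kEuclideanFloat[]    = {0x01, 0x02, 0x03};
static constexpr unsigned char kInnerProductFloat[] = {0x0A, 0x0B};

template <>
FragmentView Fragment<EuclideanTag, float>::get()
{
  return {kEuclideanFloat, sizeof(kEuclideanFloat)};
}

template <>
FragmentView Fragment<InnerProductTag, float>::get()
{
  return {kInnerProductFloat, sizeof(kInnerProductFloat)};
}

// A caller names the fragment by type. Referencing a specialization that
// was never defined fails at link time rather than at runtime.
template <typename MetricTag, typename DataT>
std::size_t fragment_size()
{
  return Fragment<MetricTag, DataT>::get().size;
}

// Small usage check for the sketch.
inline void check_fragments()
{
  assert((fragment_size<EuclideanTag, float>() == 3));
  assert((fragment_size<InnerProductTag, float>() == 2));
}
```

Assuming each specialization is compiled into its own object file inside a static archive, a client that only references `Fragment<EuclideanTag, float>::get()` leaves the other object files out of the link, which is the behavior item 3 relies on.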

@KyleFromNVIDIA requested review from a team as code owners on March 17, 2026 13:39
@KyleFromNVIDIA added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels on Mar 17, 2026
Contributor

@robertmaynard robertmaynard left a comment

Does this interact with user-defined kernels/functions? Do we lose any support for that in the future?

@KyleFromNVIDIA
Member Author

We might have to rework it a bit, but it should still be supported. The UDF library takes the user-provided code, compiles it into a fragment, and calls AlgorithmPlanner::add_fragment(const FragmentEntry&). We can add a UDFFatbinFragmentEntry class to support this.
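
One way the proposed `UDFFatbinFragmentEntry` could fit behind `AlgorithmPlanner::add_fragment(const FragmentEntry&)` is sketched below. Only those three names come from the comment above. The class layouts, the `data()`/`size()` accessors, and the `fragment_count()`/`total_bytes()` helpers are assumptions made for illustration, and the real cuVS interfaces may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical interface: the real FragmentEntry in cuVS may differ.
class FragmentEntry {
 public:
  virtual ~FragmentEntry() = default;
  virtual const std::uint8_t* data() const = 0;
  virtual std::size_t size() const         = 0;
};

// Hypothetical entry for a fragment compiled at runtime from user code.
// Unlike a statically linked fragment, it owns its bytes.
class UDFFatbinFragmentEntry : public FragmentEntry {
 public:
  explicit UDFFatbinFragmentEntry(std::vector<std::uint8_t> bytes)
    : bytes_(std::move(bytes))
  {
  }
  const std::uint8_t* data() const override { return bytes_.data(); }
  std::size_t size() const override { return bytes_.size(); }

 private:
  std::vector<std::uint8_t> bytes_;
};

// Hypothetical planner: records non-owning pointers to the fragments
// that would later be handed to the JIT linker.
class AlgorithmPlanner {
 public:
  void add_fragment(const FragmentEntry& entry) { fragments_.push_back(&entry); }
  std::size_t fragment_count() const { return fragments_.size(); }
  std::size_t total_bytes() const
  {
    std::size_t total = 0;
    for (const FragmentEntry* f : fragments_) { total += f->size(); }
    return total;
  }

 private:
  std::vector<const FragmentEntry*> fragments_;
};

// Usage check: a runtime-compiled blob is added like any other fragment.
inline void check_udf_fragment()
{
  std::vector<std::uint8_t> bytes{0xDE, 0xAD, 0xBE, 0xEF};  // placeholder bytes
  UDFFatbinFragmentEntry udf(bytes);
  AlgorithmPlanner planner;
  planner.add_fragment(udf);
  assert(planner.fragment_count() == 1);
  assert(planner.total_bytes() == 4);
}
```

In this sketch the planner stores non-owning pointers, so the UDF entry must outlive the planner's use of it. Whether the real planner copies or borrows fragment data is not stated in the thread.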

Comment thread cpp/src/neighbors/ivf_flat/ivf_flat_interleaved_scan_jit.cuh
@KyleFromNVIDIA changed the base branch from release/26.04 to main on March 17, 2026 15:36
@KyleFromNVIDIA added the breaking (Introduces a breaking change) label and removed the non-breaking (Introduces a non-breaking change) label on Mar 17, 2026
@KyleFromNVIDIA
Member Author

/merge

@rapids-bot merged commit 0b81b85 into rapidsai:main on Mar 17, 2026
80 checks passed
lowener pushed a commit to lowener/cuvs that referenced this pull request Mar 30, 2026
Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: rapidsai#1927

Labels

breaking (Introduces a breaking change), improvement (Improves an existing functionality)

Projects

Status: Done

3 participants