Report NVRTC builtin operation failures to the user #196

Merged
kkraus14 merged 1 commit into NVIDIA:main from gmarkall:nvrtc-builtins-failure
Apr 11, 2025

@gmarkall
Contributor

Previously, if NVRTC could not load its builtins library, the user would get an uninformative error message like:

numba.cuda.cudadrv.error.NvrtcError:
  Failed to call nvrtcCompileProgram: NVRTC_ERROR_BUILTIN_OPERATION_FAILURE

With this change, we specifically transform this error value into a corresponding exception, and provide the user with the log from NVRTC so they can more easily diagnose the issue.

They will now see an error like:

numba.cuda.cudadrv.error.NvrtcError: NVRTC Compilation failure whilst compiling memsys.cu:

nvrtc: error: failed to open libnvrtc-builtins.so.12.8.
  Make sure that libnvrtc-builtins.so.12.8 is installed correctly.

which clearly explains the issue.
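
The translation described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: the `Result` enum, the `check_compile` helper, and the `get_log` callable are hypothetical stand-ins, and `NvrtcError` here is a local placeholder for `numba.cuda.cudadrv.error.NvrtcError`. The idea is that result codes for which NVRTC writes a useful log are reported together with that log, while other failures keep the generic message.

```python
from enum import Enum, auto


class Result(Enum):
    # Hypothetical stand-ins for NVRTC result codes; the values are arbitrary.
    SUCCESS = auto()
    COMPILATION = auto()
    BUILTIN_OPERATION_FAILURE = auto()
    OUT_OF_MEMORY = auto()


class NvrtcError(Exception):
    """Local placeholder for numba.cuda.cudadrv.error.NvrtcError."""


# Result codes for which the program log is worth showing to the user.
RESULTS_WITH_LOG = {Result.COMPILATION, Result.BUILTIN_OPERATION_FAILURE}


def check_compile(result, name, get_log):
    """Raise NvrtcError for a failed compile, including the log if relevant.

    get_log is a hypothetical zero-argument callable returning the program
    log as a string; it is only called when the log is needed.
    """
    if result is Result.SUCCESS:
        return
    if result in RESULTS_WITH_LOG:
        raise NvrtcError(
            f"NVRTC Compilation failure whilst compiling {name}:\n\n{get_log()}"
        )
    # Any other failure keeps the generic message with the error name.
    raise NvrtcError(
        f"Failed to call nvrtcCompileProgram: NVRTC_ERROR_{result.name}"
    )


# Simulate the missing-builtins case using the log text quoted above.
log = (
    "nvrtc: error: failed to open libnvrtc-builtins.so.12.8.\n"
    "  Make sure that libnvrtc-builtins.so.12.8 is installed correctly.\n"
)
try:
    check_compile(Result.BUILTIN_OPERATION_FAILURE, "memsys.cu", lambda: log)
except NvrtcError as exc:
    message = str(exc)
    print(message)
```

Passing the log as a callable means it is only fetched on the failure paths that use it.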

@gmarkall
Contributor Author

cc @jiel-nv - this should make it clear if you encounter this issue in future.

@gmarkall force-pushed the nvrtc-builtins-failure branch from 22db01c to 07d16c4 on April 11, 2025 13:47
@gmarkall added the "2 - In Progress" (Currently a work in progress) label on Apr 11, 2025
Contributor

@kkraus14 left a comment

Would there be any easy way to test this?

@gmarkall
Contributor Author

I can't think of an easy way to test this. I tested it manually by renaming the libnvrtc-builtins.so.12.8 file. Any automated test I can think of would involve mocking or modifying the NvrtcProgram class so much that I'm not sure we'd be meaningfully testing what's there.
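
For anyone reproducing the manual check, a small diagnostic along these lines may help confirm whether the builtins library can be opened at all. It is only a sketch: `can_load` is a hypothetical helper, and it asks the system's dynamic loader via `ctypes.CDLL`, whose search path is an assumption here and may differ from wherever NVRTC itself looks for the library.

```python
import ctypes


def can_load(library_name):
    """Return (True, None) if the dynamic loader can open library_name,
    otherwise (False, reason) with the loader's error message."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, None


# The file name is the one quoted in the NVRTC log above.
ok, reason = can_load("libnvrtc-builtins.so.12.8")
print("loadable" if ok else f"not loadable: {reason}")
```

After renaming the library as described, this would be expected to report that it is not loadable, provided the loader was finding it under that name beforehand.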

@kkraus14 merged commit fcf4e49 into NVIDIA:main on Apr 11, 2025
35 checks passed
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Apr 22, 2025
- Locate nvvm, libdevice, nvrtc, and cudart from nvidia-*-cu12 wheels (NVIDIA#155)
- reinstate test (NVIDIA#226)
- Restore PR NVIDIA#185 (Stop Certain Driver API Discovery for "v2") (NVIDIA#223)
- Report NVRTC builtin operation failures to the user (NVIDIA#196)
- Add Module Setup and Teardown Callback to Linkable Code Interface (NVIDIA#145)
- Test CUDA 12.8. (NVIDIA#187)
- Ensure RTC Bindings Clamp to the Maximum Supported CC (NVIDIA#189)
- Migrate code style to ruff (NVIDIA#170)
- Use less GPU memory in test_managed_alloc_driver_undersubscribe. (NVIDIA#188)
- Update workflows to always use proxy cache. (NVIDIA#191)
@gmarkall mentioned this pull request Apr 22, 2025
gmarkall added a commit that referenced this pull request Apr 22, 2025