Use macro guard CUDA functions for back compatibility in grouped_topk_kernel.cu #25346
Conversation
@minosfuture has exported this pull request. If you are a Meta employee, you can view the originating diff in D82918389.
Code Review
The pull request introduces a compatibility wrapper is_finite to handle differences in isfinite support across CUDA versions. This is a good approach to maintain backward compatibility. My feedback focuses on improving the implementation of this new function for better robustness and readability.
template <typename T>
__device__ inline bool is_finite(const T val) {
#if (__CUDACC_VER_MAJOR__ * 10000 + __CUDACC_VER_MINOR__ * 100 >= 120800)
  bool res = cuda::std::isfinite(val);
#else
  bool res = isfinite(cuda_cast<float, T>(val));
#endif
  return res;
}
The implementation of is_finite can be improved for robustness and conciseness.
- The preprocessor check for the CUDA version can be made more robust and standard by using the __CUDACC_VER__ macro. This macro combines major, minor, and patch versions into a single integer (e.g., 120800 for 12.8.0), which makes the check cleaner and correctly handles patch versions if needed in the future. The current check only considers major and minor versions.
- The function body can be simplified by directly returning the result of the isfinite calls within the #if/#else branches, removing the need for the intermediate res variable.
Here is a suggested improved version:
template <typename T>
__device__ inline bool is_finite(const T val) {
#if defined(__CUDACC_VER__) && __CUDACC_VER__ >= 120800
return cuda::std::isfinite(val);
#else
return isfinite(cuda_cast<float, T>(val));
#endif
}
__CUDACC_VER__ should be deprecated, right?
the training data of gemini needs some refreshing :D
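For reference, __CUDACC_VER__ is no longer defined by nvcc; the supported macros are __CUDACC_VER_MAJOR__, __CUDACC_VER_MINOR__, and __CUDACC_VER_BUILD__. Below is a minimal sketch (not part of the PR diff) of a guard built only from those macros, mirroring the threshold the PR uses; the HAS_CUDA_STD_ISFINITE flag name is hypothetical and used only for this illustration.

// Sketch: combine the split nvcc version macros by hand, as the PR's guard does.
#if defined(__CUDACC_VER_MAJOR__) && defined(__CUDACC_VER_MINOR__)
  #if (__CUDACC_VER_MAJOR__ * 10000 + __CUDACC_VER_MINOR__ * 100) >= 120800
    #define HAS_CUDA_STD_ISFINITE 1  // hypothetical feature flag, CUDA 12.8+
  #endif
#endif

#ifdef HAS_CUDA_STD_ISFINITE
  // cuda::std::isfinite path
#else
  // cast-to-float + ::isfinite fallback
#endif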
Force-pushed from da22c1c to c68fe6f.
@minosfuture has exported this pull request. If you are a Meta employee, you can view the originating diff in D82918389.
Force-pushed from 3b5ebfd to ccabe13.
yewentao256 left a comment:
LGTM, thanks for the work!
Summary:
cuda::std::isfinite is not available in earlier CUDA versions. We guard it
with macros and extract a device function, is_finite.
Test Plan: build with CUDA 12.4 and 12.8
Reviewed By: houseroad
Differential Revision: D82918389
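For illustration, here is a self-contained sketch of how the guarded helper can be exercised outside the vLLM tree. The sanitize_scores kernel and the plain static_cast fallback are assumptions made only for this example; the real helper lives in grouped_topk_kernel.cu and casts through vLLM's cuda_cast<float, T>.

#include <cuda_runtime.h>
#if (__CUDACC_VER_MAJOR__ * 10000 + __CUDACC_VER_MINOR__ * 100 >= 120800)
#include <cuda/std/cmath>  // provides cuda::std::isfinite on CUDA 12.8+
#endif

// Standalone variant of the guarded helper from the PR.
template <typename T>
__device__ inline bool is_finite(const T val) {
#if (__CUDACC_VER_MAJOR__ * 10000 + __CUDACC_VER_MINOR__ * 100 >= 120800)
  return cuda::std::isfinite(val);
#else
  return isfinite(static_cast<float>(val));  // float overload of ::isfinite
#endif
}

// Hypothetical kernel: zero out non-finite scores before a top-k pass.
__global__ void sanitize_scores(float* scores, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n && !is_finite(scores[i])) {
    scores[i] = 0.0f;
  }
}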