Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
operator_propertiesoperator_properties
miscco
left a comment
There was a problem hiding this comment.
I believe all of the assumptions here are only valid for arithmetic types
bernhardmgruber
left a comment
There was a problem hiding this comment.
Please merge this new functionality with the existing cub::detail::identity_v
operator_properties
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| template <class _Op, class _Tp, class Enable = void> | ||
| struct absorbing_element | ||
| {}; |
There was a problem hiding this comment.
Strong suggestion: I think we should really aim to make the APIs for absorbing and identity element variable templates. The API would look and feel just a lot better if I could write cuda::identity_element<cuda::maximum, int>. Then the query and the template to specialize for any user provided type would also be the same entity.
For the non-existing case, you could just add a marker type:
struct __no_element{};
template <class _Op, class _Tp, class Enable = void>
inline constexpr auto absorbing_element = __no_element;
... // all your specializations
template <class _Op, class _Tp>
inline constexpr bool has_absorbing_element_v = !::cuda::std::is_same_v<absorbing_element<_Op, _Tp>, __no_element>;There was a problem hiding this comment.
I would like to second that.
We already use a constexpr function for is_associative so we should do the same, that would always leave the inline variable defined and we can provide a clear error message
There was a problem hiding this comment.
I would prefer constexpr template variable too. The problem is that they are not supported on Windows for all types that have conversion operator, like __half. Please see the internal discussion with the compiler team.
There was a problem hiding this comment.
I tried a middle ground. Use inline constexpr for all traits, except for __half and __nv_bfloat16
| template <class _Op, class _Tp, class Enable = void> | ||
| struct absorbing_element | ||
| {}; |
There was a problem hiding this comment.
I would like to second that.
We already use a constexpr function for is_associative so we should do the same, that would always leave the inline variable defined and we can provide a clear error message
This comment has been minimized.
This comment has been minimized.
| template <class _Tp> | ||
| _CCCL_GLOBAL_CONSTANT auto | ||
| __absorbing_element_v<::cuda::maximum<>, | ||
| _Tp, | ||
| ::cuda::std::enable_if_t<::cuda::std::__is_cv_extended_floating_point_v<_Tp>>> = | ||
| __absorbing_element_v<::cuda::maximum<_Tp>, _Tp>; |
There was a problem hiding this comment.
Looking at this more, I am strongly considering to move the values to functions
That would make sure that they are available even for __half and friends and also allow us to avoid storing them somewhere
There was a problem hiding this comment.
Remark: As a user, I don't care how it's implemented, but I would like to spell the public API as a variable template: cuda::absorbing_element_v<Op, T>.
There was a problem hiding this comment.
we all agree that variable template is the best solution but it doesn't work in multiple cases, for example __half.
I'm also considering contexpr functions because it solves all issues
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
/ok to test d587bca |
1 similar comment
|
/ok to test d587bca |
This comment has been minimized.
This comment has been minimized.
🥳 CI Workflow Results🟩 Finished in 16h 05m: Pass: 100%/148 | Total: 9d 18h | Max: 5h 22m | Hits: 24%/318269See results here. |
Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
Description
Address #7173
The PR provides a standardized way to query various properties of operators. For example,
cuda::std::plusis associative withunsignedbut not forfloat.