@cncastillo

I added NVIDIA’s sincos symbols to the existing sincos CallPattern:

def : CallPattern<(Op $x),
  ["__fd_sincos_1", "__fd_sincos_1f", "__fd_sincos_1l",
   "__nv_sincos", "__nv_sincosf"],

Fixes #2552.

However, I am not sure this is actually correct: ChatGPT says the ABIs differ (see below).


The current Enzyme test uses a function that returns an aggregate by value (struct-return style), e.g.:

declare [2 x double] @__fd_sincos_1(double)

whereas the NVIDIA libdevice intrinsics are pointer‑out, void return:

__device__ void __nv_sincos (double x, double *s, double *c);
__device__ void __nv_sincosf(float  x, float  *s, float  *c);

So with this change I am effectively binding symbols with a [2 x T] (T) ABI and symbols with a void (T, T*, T*) ABI to the same CallPattern. I am worried this might be wrong or at least fragile, since the existing derivative logic for sincos appears to assume a non‑void aggregate return (as in ReverseMode/sincos.ll).
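To make the mismatch concrete, here is a minimal host-side C++ sketch of the two call shapes. The names `fd_sincos_like`, `nv_sincos_like`, and `nv_sincos_as_aggregate` are hypothetical stand-ins I made up for illustration; they are not the real `__fd_sincos_1` or `__nv_sincos` symbols, and the bodies simply call `std::sin`/`std::cos`. The last function shows what an adapter from the pointer-out shape to the aggregate-return shape could look like.

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-in for an aggregate-return symbol, shaped like
//   declare [2 x double] @__fd_sincos_1(double)
std::array<double, 2> fd_sincos_like(double x) {
    return {std::sin(x), std::cos(x)};
}

// Hypothetical stand-in for a pointer-out symbol, shaped like
//   void __nv_sincos(double x, double *s, double *c)
void nv_sincos_like(double x, double *s, double *c) {
    *s = std::sin(x);
    *c = std::cos(x);
}

// Adapter: exposes the pointer-out function through the
// aggregate-return shape by writing into locals and returning them.
std::array<double, 2> nv_sincos_as_aggregate(double x) {
    double s = 0.0;
    double c = 0.0;
    nv_sincos_like(x, &s, &c);
    return {s, c};
}
```

Both shapes compute the same two values; only the way the results leave the function differs, which is exactly what a pattern written for a by-value return would not see.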


I am not really sure how to test this, so any pointers would be appreciated.

@wsmoses
Member

wsmoses commented Dec 1, 2025

Yeah, so I think we need to do one of three things: perform a transformation that changes the call to struct-return so that the tablegen utilities can be used here, define the rule manually in Enzyme, or upgrade the non-BLAS tablegen to handle mutation.
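For reference, a manually defined rule for the pointer-out shape would have to encode roughly the math below. This is a host-side C++ sketch only: `sincos_ptr_out` and `sincos_ptr_out_reverse` are hypothetical names, the code does not use any Enzyme API, and it assumes the usual reverse-mode convention that the adjoints of overwritten outputs are consumed and then zeroed.

```cpp
#include <cmath>

// Primal with the pointer-out shape (hypothetical stand-in for __nv_sincos).
void sincos_ptr_out(double x, double *s, double *c) {
    *s = std::sin(x);
    *c = std::cos(x);
}

// Reverse-mode rule for the primal above.
// On entry, *ds and *dc hold the adjoints of *s and *c.
// Since d(sin x)/dx = cos x and d(cos x)/dx = -sin x, the adjoint of x
// accumulates ds * cos(x) - dc * sin(x).
// The primal overwrites both outputs, so their adjoints are zeroed afterwards.
void sincos_ptr_out_reverse(double x, double *dx, double *ds, double *dc) {
    *dx += (*ds) * std::cos(x) - (*dc) * std::sin(x);
    *ds = 0.0;
    *dc = 0.0;
}
```

The mutation of `*ds` and `*dc` is the part that a by-value, struct-return pattern has no way to express, which is why the pointer-out symbols need one of the three routes above.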

Development

Successfully merging this pull request may close these issues.

Support for sincos NVIDIA intrinsic
