Avoid Rational in activation function gradients #399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Could you fix the definition of `deriv_hardswish`? For example, here is my suggestion:

```julia
deriv_hardswish(x) = ifelse(x < -3, oftf(x, 0), ifelse(x > 3, oftf(x, 1), x / 3 + oftf(x, 1 / 2)))
```
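As a sanity check on the suggestion (assuming `oftf(x, y) = oftype(float(x), y)`, mirroring NNlib's internal helper of that name), the three branches give the expected hardswish gradient values and all stay in the input's float type:

```julia
# Assumption: oftf mirrors NNlib's internal helper of the same name.
oftf(x, y) = oftype(float(x), y)

deriv_hardswish(x) =
    ifelse(x < -3, oftf(x, 0), ifelse(x > 3, oftf(x, 1), x / 3 + oftf(x, 1 / 2)))

deriv_hardswish(-4f0)  # 0.0f0  (flat region)
deriv_hardswish(0f0)   # 0.5f0  (linear region: x/3 + 1/2)
deriv_hardswish(4f0)   # 1.0f0  (identity region)
```

Every branch is `Float32` for `Float32` input, so broadcasting this definition stays concretely typed with no `Rational` promotion.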
Maybe I got all of them this time...
```diff
 end
+@testset verbose=true "NNlib.jl" begin
-    if CUDA.functional()
+    if get(ENV, "NNLIB_TEST_CUDA", "false") == "true"
```
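The new condition reads an opt-in environment variable with a default, so the CUDA tests are skipped unless explicitly requested. A minimal sketch of the pattern:

```julia
# get(ENV, key, default) falls back to the default when the variable is
# unset, so the CUDA testset only runs on explicit opt-in, e.g. (assumed
# invocation, not from this diff):
#   NNLIB_TEST_CUDA=true julia --project -e 'using Pkg; Pkg.test()'
run_cuda_tests = get(ENV, "NNLIB_TEST_CUDA", "false") == "true"
```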
BTW all this mess is because I thought it was sometimes failing first on another test, and hence not showing the problem. So I added an overall testset. But it turns out this doesn't matter, because NNlibCUDA already has such a testset, and that's where both problems were.
Anyway, so it's unrelated, but perhaps a good idea. I also pulled the CUDA tests first. Since these aren't always run, it's nice to find out immediately whether they are going to be run at all.
But easy to revert if someone thinks it ought not to be in this PR.
GPU test failures: the Nightly failure is still #396.
This avoids using rational numbers in some activation function gradients, as that causes problems on GPU.
Closes #398, closes #400.
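A minimal sketch of the underlying issue, using hypothetical `bad_deriv`/`good_deriv` names (the original gradient definitions are not shown in this excerpt): when one `ifelse` branch is a `Rational` literal and the other a float, the return type is a `Union`, which is type-unstable and problematic for GPU broadcast kernels.

```julia
# Old style (assumed): the two branches have different types,
# Rational{Int} vs Float32, so the result type is a small Union.
bad_deriv(x)  = ifelse(x > 3, 1 // 1, x / 3 + 1 // 2)

# New style: oftf (assumption: mirrors NNlib's helper) keeps every
# branch in the input's float type, so the result is concretely typed.
oftf(x, y) = oftype(float(x), y)
good_deriv(x) = ifelse(x > 3, oftf(x, 1), x / 3 + oftf(x, 1 / 2))

typeof(bad_deriv(5f0))   # Rational{Int64} on 64-bit systems
typeof(good_deriv(5f0))  # Float32
```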