Bug report: leakyrelu'.(CuArray(rand(Float32, 10))) fails with NNlib@0.8.3 (@0.8.2 works) #398

@terasakisatoshi

As the title says, leakyrelu'.(CuArray(rand(Float32, 10))) fails with NNlib@0.8.3 (but it works with NNlib@0.8.2).

Here is a REPL session that reproduces the issue.

julia --project=@.
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.5 (2021-12-19)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> versioninfo()
Julia Version 1.6.5
Commit 9058264a69 (2021-12-19 12:30 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, haswell)

julia> using Pkg; Pkg.status()
      Status `~/tmp/bugreport/083/Project.toml`
  [052768ef] CUDA v3.8.3
  [872c559c] NNlib v0.8.3
  [e88e6eb3] Zygote v0.6.35

julia> using Zygote, CUDA, NNlib

julia> xcpu = rand(Float32, 10)
10-element Vector{Float32}:
 0.29279244
 0.29443634
 0.7627337
 0.8952025
 0.53390884
 0.021265268
 0.5108856
 0.1079123
 0.24309349
 0.6105857

julia> leakyrelu'.(xcpu)
10-element Vector{Float32}:
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0

julia> xgpu = CuArray(xcpu)
10-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
 0.29279244
 0.29443634
 0.7627337
 0.8952025
 0.53390884
 0.021265268
 0.5108856
 0.1079123
 0.24309349
 0.6105857

julia> leakyrelu'.(xgpu)
ERROR: InvalidIRError: compiling kernel broadcast_kernel(CUDA.CuKernelContext, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, Zygote.var"#57#58"{typeof(leakyrelu)}, Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}, Int64) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to __throw_rational_argerror_zero(T) in Base at rational.jl:32)
Stacktrace:
  [1] Rational
    @ ./rational.jl:34
  [2] Rational
    @ ./rational.jl:39
  [3] //
    @ ./rational.jl:62
  [4] derivatives_given_output
    @ ~/.julia/packages/NNlib/Yn377/src/activations.jl:875
  [5] rrule
    @ ~/.julia/packages/NNlib/Yn377/src/activations.jl:875
  [6] rrule
    @ ~/.julia/packages/ChainRulesCore/IzITE/src/rules.jl:134
  [7] chain_rrule
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/chainrules.jl:216
  [8] macro expansion
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface2.jl:0
  [9] _pullback
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface2.jl:9
 [10] _pullback
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface.jl:34
 [11] pullback
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface.jl:40
 [12] #57
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface.jl:82
 [13] _broadcast_getindex_evalf
    @ ./broadcast.jl:648
 [14] _broadcast_getindex
    @ ./broadcast.jl:621
 [15] getindex
    @ ./broadcast.jl:575
 [16] broadcast_kernel
    @ ~/.julia/packages/GPUArrays/umZob/src/host/broadcast.jl:59
Reason: unsupported dynamic function invocation (call to __throw_rational_argerror_typemin(T) in Base at rational.jl:20)
Stacktrace:
  [1] checked_den
    @ ./rational.jl:24
  [2] Rational
    @ ./rational.jl:36
  [3] Rational
    @ ./rational.jl:39
  [4] //
    @ ./rational.jl:62
  [5] derivatives_given_output
    @ ~/.julia/packages/NNlib/Yn377/src/activations.jl:875
  [6] rrule
    @ ~/.julia/packages/NNlib/Yn377/src/activations.jl:875
  [7] rrule
    @ ~/.julia/packages/ChainRulesCore/IzITE/src/rules.jl:134
  [8] chain_rrule
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/chainrules.jl:216
  [9] macro expansion
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface2.jl:0
 [10] _pullback
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface2.jl:9
 [11] _pullback
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface.jl:34
 [12] pullback
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface.jl:40
 [13] #57
    @ ~/.julia/packages/Zygote/cCyLF/src/compiler/interface.jl:82
 [14] _broadcast_getindex_evalf
    @ ./broadcast.jl:648
 [15] _broadcast_getindex
    @ ./broadcast.jl:621
 [16] getindex
    @ ./broadcast.jl:575
 [17] broadcast_kernel
    @ ~/.julia/packages/GPUArrays/umZob/src/host/broadcast.jl:59
HINT: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{GPUArrays.var"#broadcast_kernel#17", Tuple{CUDA.CuKernelContext, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, Zygote.var"#57#58"{typeof(leakyrelu)}, Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}, Int64}}}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/validation.jl:119
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/I9fZc/src/driver.jl:327 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/5tW2E/src/TimerOutput.jl:252 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/I9fZc/src/driver.jl:325 [inlined]
  [5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/utils.jl:64
  [6] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/Axzxe/src/compiler/execution.jl:326
  [7] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/cache.jl:90
  [8] cufunction(f::GPUArrays.var"#broadcast_kernel#17", tt::Type{Tuple{CUDA.CuKernelContext, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, Zygote.var"#57#58"{typeof(leakyrelu)}, Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}, Int64}}; name::Nothing, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/Axzxe/src/compiler/execution.jl:297
  [9] cufunction
    @ ~/.julia/packages/CUDA/Axzxe/src/compiler/execution.jl:291 [inlined]
 [10] macro expansion
    @ ~/.julia/packages/CUDA/Axzxe/src/compiler/execution.jl:102 [inlined]
 [11] #launch_heuristic#280
    @ ~/.julia/packages/CUDA/Axzxe/src/gpuarrays.jl:17 [inlined]
 [12] copyto!
    @ ~/.julia/packages/GPUArrays/umZob/src/host/broadcast.jl:65 [inlined]
 [13] copyto!
    @ ./broadcast.jl:936 [inlined]
 [14] copy
    @ ~/.julia/packages/GPUArrays/umZob/src/host/broadcast.jl:47 [inlined]
 [15] materialize(bc::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1}, Nothing, Zygote.var"#57#58"{typeof(leakyrelu)}, Tuple{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}})
    @ Base.Broadcast ./broadcast.jl:883
 [16] top-level scope
    @ REPL[7]:1
 [17] top-level scope
    @ ~/.julia/packages/CUDA/Axzxe/src/initialization.jl:52
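
Both `Reason:` entries in the trace come from the `//` call reached through `derivatives_given_output` at activations.jl:875. My guess is that the derivative rule builds a `Rational`, whose constructor can throw, and the GPU compiler rejects that. A minimal sketch of a possible workaround follows, assuming the default `leakyrelu` slope of 0.01. The helper name `leakyrelu_grad` is hypothetical and not part of NNlib.

```julia
# Hypothetical helper, not part of NNlib: derivative of leakyrelu w.r.t. x.
# Assumes the default slope 0.01. `a` is converted to the type of `x`,
# so no Rational (and no throwing constructor) is involved.
leakyrelu_grad(x::AbstractFloat, a::Real = 0.01) = ifelse(x > 0, one(x), oftype(x, a))

xcpu = rand(Float32, 10)      # same kind of input as in the session above
gcpu = leakyrelu_grad.(xcpu)  # 1 for every positive entry, 0.01 otherwise
```

On the GPU this would be called as `leakyrelu_grad.(xgpu)`. I have not verified that on the setup above, so treat it as a sketch rather than a confirmed fix.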
