GPU kernel InvalidIRError: pois_rand(PassthroughRNG, λ) hits jl_f_throw_methoderror on Julia 1.12 / latest CUDACore #588

@ChrisRackauckas-Claude

After PR #587 fixes the UndefVarError: validate_pure_leaping_inputs, the GPU tests progress past validation but now fail at kernel compilation with a separate pre-existing issue.

Symptom

From SciML/JumpProcesses.jl actions run 25629456752:

LoadError: InvalidIRError: compiling MethodInstance for
  JumpProcessesKernelAbstractionsExt.gpu_simple_tau_leaping_kernel(...)
  resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to jl_f_throw_methoderror)
Stacktrace:
 [1] rand        @ Random/src/generation.jl:114    # Julia 1.12.6
 [2] rand        @ Random/src/Random.jl:255
 [3] randexp     @ CUDACore/src/device/random.jl:339   (device_override)
 [4] count_rand  @ PoissonRandom/MLLFD/src/PoissonRandom.jl:18
 [5] pois_rand   @ PoissonRandom/MLLFD/src/PoissonRandom.jl:137
 [6] kernel body @ ext/JumpProcessesKernelAbstractionsExt.jl:135

Reproducing call site

# ext/JumpProcessesKernelAbstractionsExt.jl:135
counts[k] = pois_rand(PoissonRandom.PassthroughRNG(), rate_cache[k])

Dispatch chain on the GPU device

  • randexp(::PassthroughRNG) (PoissonRandom) → bare randexp() with no rng
  • randexp() → randexp(default_rng()) → randexp(::Philox2x32) (CUDACore @device_override makes default_rng() return Philox2x32() on device)
  • randexp(rng::AbstractRNG) @device_override at CUDACore/src/device/random.jl:339 calls Random.rand(rng, Random.UInt52Raw())
  • rand(rng, X) at Random.jl:255 → Sampler(rng, X, Val(1)) then rand(rng, sampler)
  • rand(r::AbstractRNG, ::SamplerTrivial{UInt52Raw{UInt64}}) at generation.jl:114 → _rand52(r, rng_native_52(r))

Somewhere along this chain on Julia 1.12.6 + latest CUDACore there is a path that resolves to throw(MethodError(...)) which GPUCompiler can't prove unreachable, hence InvalidIRError.
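
To help narrow this down, the last two steps of the chain can be walked on the CPU. The following is only a rough probe, not a reproduction of the device failure: it uses the stdlib Xoshiro as a stand-in for Philox2x32 (an assumption, since the device RNG is not used on the host here), and the helper name probe_chain is made up for illustration. Random.Sampler, Random.UInt52Raw and Random.rng_native_52 are Random internals taken from the chain above, not public API.

```julia
using Random

# Hypothetical helper: walks the sampler-construction steps for a given RNG.
function probe_chain(rng::AbstractRNG)
    # rand(rng, X) first builds a sampler for X = UInt52Raw().
    sampler = Random.Sampler(rng, Random.UInt52Raw(), Val(1))
    # The generic fallback dispatches on rng_native_52(rng); check that a method applies.
    has_native = hasmethod(Random.rng_native_52, Tuple{typeof(rng)})
    # The full call, as made by the randexp device override.
    bits = rand(rng, Random.UInt52Raw())
    return (; sampler, has_native, bits)
end

result = probe_chain(Xoshiro(1))   # Xoshiro is a CPU stand-in, not Philox2x32
println(typeof(result.sampler))
println(result.has_native)
```

Running the same probe with the RNG type that is actually active on device, if it can be constructed on the host, may show which step lacks a method.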

Environment from the failing run

  • Julia 1.12.6
  • CUDACore version gtlJx
  • PoissonRandom version MLLFD (older 0.4.x, with Random.rand(::PassthroughRNG) = rand() pattern)
  • KernelAbstractions ecO4B
  • GPUCompiler lHkad

Hypotheses to investigate

  1. PoissonRandom's PassthroughRNG design hits a Julia-1.12-specific method-table edge case. PassthroughRNG defines only Random.rand(rng) = rand(), Random.randexp(rng) = randexp(), and Random.randn(rng) = randn(), with no second-argument methods. On Julia 1.12, the Sampler machinery may infer a path that calls rand(rng, T) for some T that has no matching method, yielding a statically reachable throw(MethodError).
  2. CUDACore's randexp(rng::AbstractRNG) override paired with Philox2x32 triggers a path to rand(rng, UInt52Raw()) whose sampler-construction sequence on Julia 1.12 reaches throw(MethodError).
  3. The fix may live in PoissonRandom: either define a richer set of Random.rand(rng::PassthroughRNG, ...) methods, or drop PassthroughRNG in favor of using default_rng() on device. Note that PoissonRandom 0.4.7 (master) exists, but the failing run still uses an older artifact dir; confirm that Project.toml compat allows the latest.
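
A minimal sketch of what a richer method set could look like, written against a local stand-in type so it runs without PoissonRandom. StubPassthroughRNG is hypothetical and only mirrors the three zero-extra-argument methods described in hypothesis 1. The typed methods are the proposed addition, and whether they resolve the InvalidIRError on device is untested.

```julia
using Random

# Hypothetical stand-in for PoissonRandom.PassthroughRNG (not the real type).
struct StubPassthroughRNG <: AbstractRNG end

# Existing pattern, as described in this issue: forward to the bare functions.
Random.rand(::StubPassthroughRNG) = rand()
Random.randexp(::StubPassthroughRNG) = randexp()
Random.randn(::StubPassthroughRNG) = randn()

# Proposed addition: explicit typed methods, so rand(rng, T) never falls through
# to the generic Sampler path. Concrete types are listed one by one to avoid
# method ambiguities with Random's own typed fallbacks.
for T in (Float16, Float32, Float64)
    @eval begin
        Random.rand(::StubPassthroughRNG, ::Type{$T}) = rand($T)
        Random.randexp(::StubPassthroughRNG, ::Type{$T}) = randexp($T)
        Random.randn(::StubPassthroughRNG, ::Type{$T}) = randn($T)
    end
end

rng = StubPassthroughRNG()
x = rand(rng, Float32)     # dispatches to the explicit Float32 method
e = randexp(rng, Float32)
```

On device, the bare rand(T) / randexp(T) calls would still go through CUDACore's overrides, so this only removes the PassthroughRNG-specific gap in the method table.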

Suggested fix paths (not yet tried)

  • Replace pois_rand(PassthroughRNG(), λ) with pois_rand(λ) so the call uses Random.default_rng(), which CUDACore already overrides on device. This is conceptually the same chain on the device, but it skips the PassthroughRNG → bare-function indirection.
  • Hand-roll the few randexp calls inside the kernel using device intrinsics directly instead of going through PoissonRandom (most surgical for this kernel).
  • Update PoissonRandom to define a complete rand/randexp/randn family for PassthroughRNG that takes Type{T} arguments and forwards to bare rand(T) / etc., so dispatch closes statically on Julia 1.12.
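
For the hand-rolled option, a rough CPU sketch of what the sampling could reduce to is below. The names hand_rolled_randexp and hand_rolled_pois are made up for illustration. The sketch assumes that bare rand(Float32) and log1p compile inside the kernel, which has not been checked here. It mirrors only the counting method that appears in the stack trace (count_rand), and its cost grows linearly with λ, so it suits small rates only.

```julia
using Random

# Exponential(1) draw by inverse transform, using only bare rand(Float32).
# rand(Float32) is in [0, 1), so 1 - u is in (0, 1] and the result is finite and >= 0.
@inline hand_rolled_randexp() = -log1p(-rand(Float32))

# Poisson(λ) draw: count unit-rate exponential arrivals that land before time λ.
function hand_rolled_pois(λ::Float32)
    n = 0
    c = hand_rolled_randexp()      # time of the first arrival
    while c < λ
        n += 1
        c += hand_rolled_randexp() # time of the next arrival
    end
    return n
end

# Hypothetical usage in place of the pois_rand call:
# counts[k] = hand_rolled_pois(Float32(rate_cache[k]))
```

This sidesteps both PassthroughRNG and the randexp override, at the price of duplicating sampling logic that PoissonRandom already maintains.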

Reproducer

The existing test test/gpu/regular_jumps.jl (SIR / SEIR with SimpleTauLeaping + EnsembleGPUKernel(CUDABackend())) hits this on Julia 1.12 + latest CUDACore.

Why this isn't fixed in #587

PR #587 is a one-line validate_pure_leaping_inputs qualification fix. The IR error is a different and deeper issue that requires separate investigation and is better handled in its own focused PR, in line with the small-PR philosophy in CLAUDE.md.

🤖 Reported by Claude Code while iterating on PR #587.
