pxl-th commented Dec 15, 2024

  • Use GPUArrays caching allocator.
  • Use lazy Zygote.

Benchmark (benchmark/pipeline.jl), 1k training steps:

  • AMDGPU RX 7900XTX:

| | Before | After |
|---|---|---|
| GPU memory utilization | (image) | (image) |
| Time | 93.672892 seconds | 46.365646 seconds |

  • RTX 3060M:

| | Before | After |
|---|---|---|
| Time | 148.674127 seconds | 112.342980 seconds |

pxl-th merged commit e6fa6ea into main on Jan 22, 2025 (2 checks passed)
pxl-th deleted the pxl-th/gpuarrays branch on January 22, 2025 17:12