A low-level, high-performance GPU-based radix sort plugin for Bevy, optimized for sorting key/value pairs of type u32.
- Based on the paper Fast 4-way parallel radix sorting on GPUs
- High-performance
radix sortimplementation fully executed on the GPU - High compatibility, capable of running on most modern GPUs
- Efficient for large datasets with minimal CPU overhead
- Currently not supported on web platforms (due to the lack of push_constants support in the WebGPU standard)
- Optimized specifically for
u32key/value pairs
Add the following dependency to your Cargo.toml:
[dependencies]
bevy_radix_sort = "0.15.0"Important Note on GPU Benchmarking:
Designing automated GPU benchmarks in Bevy presents significant challenges. Benchmark results can vary widely across different platforms, hardware configurations, and driver versions, making it difficult to provide standardized performance metrics.
For accurate performance analysis, I recommend using GPU-specific profiling tools:
- NVIDIA: Nsight Graphics
- AMD: Radeon GPU Profiler
- Apple: Metal System Trace in Xcode
- Intel: Graphics Performance Analyzers
- Vulkan/Cross-platform: RenderDoc
These tools provide detailed insights into GPU execution times, memory usage, and potential bottlenecks that simple timing measurements cannot capture.
Below are benchmark results from testing on an NVIDIA RTX 4070 Ti Super:
| Number of Key-Value Pairs | Execution Time (ms) |
|---|---|
| 10,000 | 0.12 |
| 100,000 | 0.19 |
| 1,000,000 | 0.43 |
| 10,000,000 | 2.10 |
Note: Performance may vary based on system configuration, driver version, and concurrent GPU workloads.
| bevy_radix_sort | bevy |
|---|---|
| 0.15 | 0.15 |
Check out the example implementation to see how to integrate the radix sort into your Bevy application.
- Bevy Millions Ball: A high-performance collision detection system capable of simulating millions of spheres in real-time. This project uses
bevy_radix_sortas its core algorithm for spatial partitioning and efficient collision detection, demonstrating the plugin's effectiveness in large-scale physics simulations.
Contributions are welcome! Feel free to submit issues or pull requests on our GitHub repository.
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.



