## Description
I am requesting a feature that allows uv to automatically select the correct CPU or GPU backend wheel variant for packages that ship multiple prebuilt versions - not only for PyTorch, but also for ecosystem libraries like xformers and flash-attn.
Currently:

- `torch-backend = "auto"` correctly installs CUDA / ROCm / CPU variants for PyTorch
- But companion packages like `xformers` and `flash-attn` are not matched to the selected backend
- This frequently results in:
  - CUDA PyTorch
  - CPU-only xformers
  - leading to missing CUDA kernels and runtime errors
This happens because fully CUDA-enabled wheels for `xformers` and `flash-attn` are not published on PyPI, so uv (like pip) falls back to the CPU wheels. And because `xformers` is then installed without CUDA support, the resolver may downgrade PyTorch to CPU, or install mismatched wheel variants, causing conflicts and inconsistent environments. Avoiding this currently requires optional dependencies, platform markers, or manual wheel URLs, all of which create unnecessary overhead; a sketch of that workaround is shown below.
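Based on the uv PyTorch guide, the current manual pattern looks roughly like this (version bounds here are placeholders), and it has to be repeated for every backend-sensitive package:

```toml
# pyproject.toml -- the manual pattern from the uv PyTorch guide
# (version bounds are placeholders)
[project.optional-dependencies]
cpu = ["torch>=2.6.0"]
cu128 = ["torch>=2.6.0"]

[tool.uv]
conflicts = [[{ extra = "cpu" }, { extra = "cu128" }]]

[tool.uv.sources]
torch = [
  { index = "pytorch-cpu", extra = "cpu" },
  { index = "pytorch-cu128", extra = "cu128" },
]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true
```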
## Requested Behavior
Extend the `torch-backend` logic into a general backend-aware resolver, e.g.:
```toml
# uv.toml
[tool.uv.pip]
backend = "auto"  # detect CPU / CUDA / ROCm
auto-match-packages = ["xformers", "flash-attn"]
```

Then, on `uv sync`, uv should:
- Detect the system backend (CPU / CUDA version / ROCm)
- Select the correct wheel sources
- Install matching GPU/CPU wheel variants across packages consistently
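Conceptually, the detection step in the first bullet above might behave like this sketch (purely illustrative; uv is written in Rust, and every name here is invented for exposition):

```python
# Illustrative sketch only -- not uv's actual implementation.
import shutil
import subprocess

def detect_backend() -> str:
    """Best-effort guess of the accelerator backend on this machine."""
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0 and result.stdout.strip():
            # A real implementation would map the driver version to the
            # newest supported CUDA toolkit (e.g. "cu128").
            return "cuda"
    if shutil.which("rocminfo"):  # ROCm utility present on AMD systems
        return "rocm"
    return "cpu"

print(detect_backend())
```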
Additionally, I am requesting support for backend-aware index selection when multiple CUDA/CPU indexes are defined, e.g.:
```toml
[tool.uv.sources]
torch = [
  { index = "pytorch-cpu" },
  { index = "pytorch-cu128", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
  { index = "pytorch-cu129", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
  { index = "pytorch-cu130", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
]
torchvision = [
  { index = "pytorch-cpu" },
  { index = "pytorch-cu128", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
  { index = "pytorch-cu129", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
  { index = "pytorch-cu130", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
]

[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu129"
url = "https://download.pytorch.org/whl/cu129"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu130"
url = "https://download.pytorch.org/whl/cu130"
explicit = true

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true
```

uv should automatically select the correct index based on the detected backend, and ensure that companion packages (such as `xformers` and `flash-attn`) resolve against the same backend index without requiring manual markers or optional dependency trees.
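One possible (hypothetical) shape for this: let each index declare the backend it serves. The `backend` key below does not exist in uv today; it is only a sketch of the requested behavior:

```toml
# Hypothetical extension -- tag each index with the backend it serves,
# so the resolver can pick one per detected hardware.
# `backend` is NOT a real uv key today.
[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
backend = "cu128"
explicit = true

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
backend = "cpu"
explicit = true
```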
Note: this feature could eliminate the need for optional dependencies altogether, which would simplify project configuration :)
## Why This Matters
- `xformers` GPU wheels must match PyTorch's CUDA version
- Incorrect fallback to CPU wheels causes:
  - Failed fused attention ops
  - Silent CPU fallback and performance drops
  - Common runtime errors (`No available CUDA kernel`)
- This is currently a common failure point for new ML users setting up environments (see the illustrative snippet below)
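For illustration, this is roughly how the mismatch surfaces at runtime (the exact exception text varies across xformers versions; the message in the comment is an approximation, not a verbatim log):

```python
# Illustrative repro of the mismatch: CUDA-enabled torch + CPU-only xformers.
import torch
from xformers.ops import memory_efficient_attention

# xformers expects (batch, seq_len, num_heads, head_dim)
q = k = v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

# With CPU-only xformers wheels this fails with something like:
#   NotImplementedError: No operator found for
#   `memory_efficient_attention_forward` with inputs: ...
out = memory_efficient_attention(q, k, v)
```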
## I am aware of existing docs
This request builds upon:

- https://docs.astral.sh/uv/guides/integration/pytorch/
- https://docs.astral.sh/uv/reference/settings/#pip_torch-backend
- https://docs.astral.sh/uv/concepts/projects/config/#augmenting-build-dependencies
But expands automatic backend resolution beyond PyTorch to the entire CUDA/ROCm ecosystem.
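For reference, the existing PyTorch-specific mechanism that this request would generalize (per the settings reference linked above):

```toml
# uv.toml -- the current PyTorch-only mechanism
[tool.uv.pip]
torch-backend = "auto"
```

or equivalently on the command line:

```bash
uv pip install torch --torch-backend=auto
```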
## Example
Current behavior:
```bash
uv add torch xformers flash-attn
```

Since no CUDA index is provided, this installs:

- torch (CPU)
- xformers (CPU)
- flash-attn (CPU)
Trying to force CUDA manually:
```bash
uv add torch xformers flash-attn --index-url https://download.pytorch.org/whl/cu128
```

Current result with uv:

- torch (CUDA 12.8)
- xformers (CPU-only or mismatched version)
- flash-attn (CPU-only or fails to resolve)
This happens because uv applies the index selection to torch but not to the related ecosystem packages, leading to conflicts and inconsistent backends.
With the requested feature:
```toml
# uv.toml
[tool.uv.pip]
backend = "auto"
auto-match-packages = ["xformers", "flash-attn"]
```

```bash
uv sync
```

Result:

- torch (CUDA)
- xformers (CUDA)
- flash-attn (CUDA)
All resolved to the same detected backend, with no manual index flags, no platform markers, and no optional dependency trees.
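A quick sanity check of such an environment, using standard PyTorch and xformers introspection:

```python
import torch
import xformers

print(torch.__version__)          # e.g. "2.7.0+cu128" on a CUDA build
print(torch.version.cuda)         # e.g. "12.8"; None on a CPU-only build
print(torch.cuda.is_available())  # True when the CUDA backend is usable
print(xformers.__version__)       # should line up with the same backend
```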
Thank you for reviewing this request :)