Metal support #36

Closed
neurolabusc wants to merge 2 commits into dipy:master from neurolabusc:master

@neurolabusc

neurolabusc commented Mar 1, 2026

Provide support for Metal shaders on Apple silicon. See the Metal README for details. Highlights include:

  • Faster
  • Optional Soft Angular Weighting biases trajectory continuation toward directions close to the current heading, applied only at voxels with multi-directional ODFs. This reflects the physiological prior that white matter fibers follow smooth, gradually curving paths — axons do not make sharp turns. In ambiguous voxels where fiber bundles run in different directions (the "kissing vs. crossing" problem), this weighting favors the interpretation most consistent with the incoming trajectory, reducing spurious crossings into adjacent bundles.
  • An SH basis fix in cu_direction_getters.py and the shared boot_utils.py module, which also benefits the CUDA path. This should bring GPU and CPU results closer together.
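
To make the soft angular weighting idea concrete, here is a minimal Python sketch. It is not the code in this PR: the function names, the exponential form of the weight, the `sharpness` parameter, and the deterministic choice of the best peak are all assumptions made for illustration. The only parts taken from the description above are that the weighting favors directions close to the current heading and that it is applied only where the ODF has more than one candidate direction.

```python
import math


def soft_angular_weights(heading, directions, amplitudes, sharpness=2.0):
    """Re-weight ODF peak amplitudes by alignment with the current heading.

    Hypothetical helper: the exponential falloff and `sharpness` are
    illustrative assumptions, not the weighting used by the Metal backend.
    """
    # Single-peak voxels are left untouched: the weighting is only meant
    # to resolve ambiguity between several candidate directions.
    if len(directions) < 2:
        return list(amplitudes)

    weighted = []
    for direction, amplitude in zip(directions, amplitudes):
        # ODFs are antipodally symmetric, so use |cos(angle)|.
        cos_angle = abs(sum(h * d for h, d in zip(heading, direction)))
        # Weight is 1 for perfect alignment and decays as the angle grows.
        weighted.append(amplitude * math.exp(sharpness * (cos_angle - 1.0)))
    return weighted


def pick_direction(heading, directions, amplitudes, sharpness=2.0):
    """Return the candidate with the highest weighted amplitude,
    flipped if needed so the streamline keeps moving forward."""
    weights = soft_angular_weights(heading, directions, amplitudes, sharpness)
    best = max(range(len(weights)), key=weights.__getitem__)
    chosen = directions[best]
    if sum(h * d for h, d in zip(heading, chosen)) < 0:
        chosen = tuple(-c for c in chosen)
    return chosen


if __name__ == "__main__":
    heading = (1.0, 0.0, 0.0)
    # Two peaks: a stronger one perpendicular to the heading and a
    # slightly weaker one aligned with it.
    peaks = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
    amps = [1.0, 0.8]
    print(pick_direction(heading, peaks, amps))
```

In this toy case the perpendicular peak has the larger raw amplitude, but after weighting the aligned peak wins, which is the "stay on the incoming trajectory" behavior described above. A probabilistic tracker would presumably sample from the weighted values rather than take the maximum.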

Note: This Metal backend was written by Claude Code (Anthropic's AI coding agent), with architectural direction, validation, and iterative review from me. The port is intentionally derivative — it mirrors the naming conventions, file structure, and two-pass kernel architecture of the existing CUDA backend to minimize cognitive overhead for contributors familiar with the codebase.

@36000
Collaborator

36000 commented Mar 2, 2026

Very cool! Does it work? Out of curiosity, how much of this was Claude able to do on its own?

I no longer own a Mac to test this code, but this is exciting!

Edit:
Just read your email. It sounds like it is indeed working! That is very impressive translation work from Claude. And thank you for fixing the bootstrapping code.

setuptools-scm requires git to determine the package version from tags.
The Dockerfile was missing git, causing pip install to fail in CI.

Two fixes:
- Install git alongside curl in the Dockerfile so setuptools-scm can
  read the version from the copied .git history
- Add fallback_version = "0.0.0" to [tool.setuptools_scm] in
  pyproject.toml as a safety net for git-free environments (shallow
  clones, tarballs, GitHub zip archives)

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

@arokem
Collaborator

arokem commented Mar 2, 2026

Yep! I ran this example: https://github.com/tractometry/pyAFQ/blob/main/examples/howto_examples/pyAFQ_with_GPU.py, with the code on pyAFQ's main and this PR, and what an experience. 1M streamlines in 24 seconds on my Mac laptop and a full run of AFQ in under 15 minutes 🤯

@skoudoro
Member

skoudoro commented Mar 3, 2026

That's great news, and great performance!

A similar approach with Vulkan might be useful for other graphics cards.

@neurolabusc
Author

@36000 The Metal conversion was not one-shot. It did require quite a bit of help. Not having a native GPU implementation to compare against caused a lot of rough edges. I think I would have been better off creating a WebGPU implementation first on a CUDA system, and then using that as a basis for Metal. However, the issues did allow me to develop some nice validation measures and come up with some ideas for performance improvements (e.g. while not enabled by default, the angular weighting seems like a nice enhancement). The conversion from Metal to WebGPU was very smooth and virtually one-shot. At that stage I had a mature CLAUDE.md file, a lot of explicit validations, and a local GPU implementation. I worked with Claude to develop a five-phase plan, and each step proceeded smoothly.

@skoudoro I have added a WebGPU backend. In theory this should support Intel, AMD, Qualcomm, and other GPUs. I have only tested on Apple and NVIDIA GPUs. The performance is good but not native.

I am closing this PR, as it is a subset of PR37.

@Garyfallidis
Collaborator

Garyfallidis commented Mar 3, 2026

Thank you @neurolabusc! Great idea!

neurolabusc closed this Mar 3, 2026