Conversation
|
Very cool! Does it work? Out of curiosity, how much of this was Claude able to do on its own? I no longer own a Mac to test this code, but this is exciting! Edit: |
setuptools-scm requires git to determine the package version from tags. The Dockerfile was missing git, causing pip install to fail in CI.

Two fixes:
- Install git alongside curl in the Dockerfile so setuptools-scm can read the version from the copied .git history
- Add fallback_version = "0.0.0" to [tool.setuptools_scm] in pyproject.toml as a safety net for git-free environments (shallow clones, tarballs, GitHub zip archives)

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
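For reference, the second fix might look roughly like the fragment below. This is only a sketch of the setting named in the commit message; the rest of the project's pyproject.toml is not shown here and is not assumed.

```toml
# pyproject.toml (fragment)
[tool.setuptools_scm]
# Used only when the version cannot be derived from git metadata,
# e.g. when building from a tarball or a GitHub zip archive.
fallback_version = "0.0.0"
```

With this in place, an install from a source tree without usable git metadata should report version 0.0.0 instead of failing, while builds with a full .git history still take their version from tags.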
|
Yep! I ran this example: https://github.com/tractometry/pyAFQ/blob/main/examples/howto_examples/pyAFQ_with_GPU.py, with the code on pyAFQ's main and this PR, and what an experience. 1M streamlines in 24 seconds on my Mac laptop and a full run of AFQ in under 15 minutes 🤯 |
|
That's great news, such performance! A similar approach might be useful with Vulkan for other graphics cards. |
|
@36000 The Metal conversion was not one-shot. It did require quite a bit of help. Not having a native GPU implementation caused a lot of rough edges. I think I would have been better off creating a WebGPU implementation first on a CUDA system, and then using that as a basis for Metal. However, the issues did allow me to develop some nice validation measures and come up with some ideas for performance improvements (e.g., while not implemented by default, the angular weighting seems like a nice enhancement). The conversion from Metal to WebGPU was very smooth and virtually one-shot. At that stage I had a mature CLAUDE.md file, a lot of explicit validations, and a local GPU implementation. I worked with Claude to develop a five-phase plan and each step proceeded smoothly. @skoudoro I have added a WebGPU backend. In theory this should support Intel, AMD, Qualcomm, etc. GPUs. I have only tested on Apple and NVIDIA GPUs. The performance is good but not native. I am closing this PR, as it is a subset of PR37. |
|
Thank you @neurolabusc! Great idea! |
Provide support for Metal shaders on Apple silicon. See the Metal README for details. Highlights include:
Note: This Metal backend was written by Claude Code (Anthropic's AI coding agent), with architectural direction, validation, and iterative review from me. The port is intentionally derivative — it mirrors the naming conventions, file structure, and two-pass kernel architecture of the existing CUDA backend to minimize cognitive overhead for contributors familiar with the codebase.