
ollama: 0.5.7 -> 0.5.11 #383387

Merged
prusnak merged 1 commit into NixOS:master from Pleune:ollama-0.5.11
Feb 20, 2025

Conversation

@Pleune
Contributor

@Pleune Pleune commented Feb 19, 2025

Upstream overhauled their build system. For nixpkgs, this means cmake instead of make to compile the ggml libs.

This update should bring major performance improvements for those running CPU-only, as upstream now ships ggml libs optimized per CPU generation.

I have built this with and without cuda and both are working on my end. I can't test with ROCM.

I also want to note here that setting gcc.arch breaks this build, but that is not new: it is caused by an nvcc / gcc incompatibility bug. Below are some context links I want to collect here, just to make it easier for myself to find in the future.

When setting -march, only a single function call in an init function still triggers this bug; the rest have been avoided by the llama.cpp team. I can patch that call out with an alternative, and the builds are then fixed, with around a 3% performance gain on my system. However, something else is going on: some model architectures produce complete gibberish output, and I can't seem to fix that.

Nvidia claims newer versions of the toolkit add support for the required gcc builtins, but I tried 12.6 and 12.8 and the issue still exists. Ubuntu appears to patch gcc's amximmintrin.h file as suggested in the Nvidia forums, since when I check gcc versions >= 13.2 they still ship the old header.
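For context, this is roughly the kind of configuration that triggers the failure. The localSystem / gcc.arch attribute path is the standard nixpkgs platform mechanism, but the chosen microarchitecture is only an illustrative example:

```nix
# Hypothetical sketch: instantiating nixpkgs with a host CPU
# microarchitecture pinned. With gcc.arch set like this, nvcc ends up
# invoking gcc with -march=..., which hits the AMX-intrinsics header
# incompatibility described above. "znver3" is just an example value.
import <nixpkgs> {
  localSystem = {
    system = "x86_64-linux";
    gcc.arch = "znver3";
  };
}
```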

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.


@github-actions github-actions bot added 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Feb 19, 2025
@abysssol
Contributor

I'm unable to get ollama-rocm to utilize my GPU. I set HSA_OVERRIDE_GFX_VERSION, and ollama serve logs that it is using the rocm runner, but I don't see any GPU utilization in radeontop, not with large or small models.

Also, ollama-rocm fails to build on this branch, but after rebasing it on the most recent master, it seems to build fine, so this PR isn't at fault. However, it seems you didn't even test that ollama-rocm builds on your branch? Were you unaware that neither an AMD GPU nor ROCm drivers are required to build ollama-rocm?

CI now checks that nix files are formatted with nixfmt, so try not to forget that. I configured my text editor to use nixfmt on file save, you may want to consider doing something similar.

@Pleune
Contributor Author

Pleune commented Feb 20, 2025

I'm unable to get ollama-rocm to utilize my GPU. I set HSA_OVERRIDE_GFX_VERSION, and ollama serve logs that it is using the rocm runner, but I don't see any GPU utilization in radeontop, not with large or small models.

Also, ollama-rocm fails to build on this branch, but after rebasing it on the most recent master, it seems to build fine, so this PR isn't at fault. However, it seems you didn't even test that ollama-rocm builds on your branch? Were you unaware that neither an AMD GPU nor ROCm drivers are required to build ollama-rocm?

CI now checks that nix files are formatted with nixfmt, so try not to forget that. I configured my text editor to use nixfmt on file save, you may want to consider doing something similar.

Sorry, I was originally on a newer master but cherry-picked back to one from a few weeks ago, so I could point my system flakes at this to test with gcc.arch set (without spending 3 days recompiling everything on top of it).

I can reformat if that is best. I generally prefer not to reformat lines I'm not changing, to keep the git blame simpler, and it seems this file was not formatted in the past. I do have nixfmt set up.

Are you able to run with strace in case it's a lib issue? I didn't have to change anything for cuda to work, but certainly the build system changed pretty dramatically.

@Pleune
Contributor Author

Pleune commented Feb 20, 2025

Debugging a little more: when I build with rocm acceleration, the cmake variable AMDGPU_TARGETS remains empty, so ggml-hip is not built. I'm guessing this is the culprit. I took inspiration from how whisper-cpp sets -DAMDGPU_TARGETS, and it looks like the proper ggml libs are now built. I have also mirrored how koboldcpp exposes a cudaArches input, to make ollama more overridable.

Can you try again with this update?
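If it helps testing, overriding the new inputs should look roughly like this. The input names are taken from this PR's commit; whether rocmGpuTargets expects a list or a semicolon-separated string depends on the final implementation, so the CMake-style string below is an assumption:

```nix
# Hypothetical sketch: pinning the GPU targets via the package inputs
# added in this PR. The value format mirrors CMake's AMDGPU_TARGETS
# convention (semicolon-separated); adjust the gfx targets to your GPU.
ollama-rocm.override {
  rocmGpuTargets = "gfx1030;gfx1100";
}
```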

@Pleune Pleune force-pushed the ollama-0.5.11 branch 2 times, most recently from 40c0eb0 to 52d4d8a Compare February 20, 2025 16:31
@Pleune Pleune force-pushed the ollama-0.5.11 branch 3 times, most recently from aa9bfc0 to cc7a2bc Compare February 20, 2025 19:53
Upstream overhauled their build system. For nixpkgs, this means cmake
instead of make to compile the ggml libs.

Added two new package inputs, `cudaArches` and `rocmGpuTargets` for more
configurability when overriding.

Co-Authored-By: Pavol Rusnak <[email protected]>
@prusnak
Member

prusnak commented Feb 20, 2025

I have simplified the changes somewhat in 2186263:

  • removed the cross-compilation fixes
  • used fetchpatch to apply the patch
  • refactored cmakeFlagsCudaArchitectures and cmakeFlagsRocmTargets

I also tested that everything compiles just fine on all 4 platforms: linux-x86_64, linux-aarch64, darwin-x86_64, darwin-aarch64.

I will now compile cuda and rocm on linux-x86_64 and merge if everything is OK

Thank you!

@Pleune
Contributor Author

Pleune commented Feb 20, 2025

I have simplified the changes somewhat in 2186263:

* removed the cross-compilation fixes

* used fetchpatch to apply the patch

* refactored cmakeFlagsCudaArchitectures and cmakeFlagsRocmTargets

I also tested that everything compiles just fine on all 4 platforms: linux-x86_64, linux-aarch64, darwin-x86_64, darwin-aarch64.

I will now compile cuda and rocm on linux-x86_64 and merge if everything is OK

Thank you!

It looks like the crossPkgs build has broken now that CMAKE_SYSTEM_{NAME,PROCESSOR} and the ggml install removal patch are gone. You may want to add those back in:

❯ nix build github:NixOS/nixpkgs/21862634ccc7f2453cad20ea2c354f4539d90e13#pkgsCross.aarch64-multiplatform.ollama
error: builder for '/nix/store/l73nhybcr0h0xhq28gl640sx705mn7nm-ollama-aarch64-unknown-linux-gnu-0.5.11.drv' failed with exit code 2;
       last 25 log lines:
       > aarch64-unknown-linux-gnu-gcc: error: unrecognized command-line option '-mavx512vnni'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mamx-tile'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mavx512bf16'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mamx-int8'
       > make[2]: *** [ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/build.make:177: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/ggml-cpu/amx/mmq.cpp.o] Error 1
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mavx512vnni'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mamx-tile'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mavx512vbmi'
       > aarch64-unknown-linux-gnu-gcc: error: unrecognized command-line option '-mavx512bf16'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mamx-int8'
       > make[2]: *** [ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/build.make:121: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/ggml-cpu/ggml-cpu-hbm.cpp.o] Error 1
       > aarch64-unknown-linux-gnu-gcc: error: unrecognized command-line option '-mamx-tile'
       > aarch64-unknown-linux-gnu-gcc: error: unrecognized command-line option '-mamx-int8'
       > make[2]: *** [ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/build.make:135: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/ggml-cpu/ggml-cpu-quants.c.o] Error 1
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mavx512bf16'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mavx512vnni'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mamx-tile'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mamx-int8'
       > make[2]: *** [ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/build.make:149: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/ggml-cpu/ggml-cpu-traits.cpp.o] Error 1
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mavx512bf16'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mamx-tile'
       > aarch64-unknown-linux-gnu-g++: error: unrecognized command-line option '-mamx-int8'
       > make[2]: *** [ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/build.make:163: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/ggml-cpu/amx/amx.cpp.o] Error 1
       > make[1]: *** [CMakeFiles/Makefile2:609: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-sapphirerapids.dir/all] Error 2
       > make: *** [Makefile:136: all] Error 2
       For full logs, run 'nix log /nix/store/l73nhybcr0h0xhq28gl640sx705mn7nm-ollama-aarch64-unknown-linux-gnu-0.5.11.drv'.

@prusnak
Member

prusnak commented Feb 20, 2025

It looks like the crossPkgs build has broken now that CMAKE_SYSTEM_{NAME,PROCESSOR} is gone. You may want to add those back in

I don't think it is worth complicating this package with cross compilation. Natively the package builds fine.

@prusnak
Member

prusnak commented Feb 20, 2025

I don't think it is worth complicating this package with cross compilation. Natively the package builds fine.

The proper thing to do (if we ever wanted cross compilation in the future) would be to leverage cmakeFlags, and especially the makeCMakeFlags function, to do the magic for us.
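A rough sketch of what that could look like, deriving the CMake system name and processor from stdenv instead of hardcoding them. This is untested: lib.cmakeFeature and the stdenv.hostPlatform.uname attributes are existing nixpkgs facilities, but the exact wiring here is an assumption:

```nix
# Hypothetical sketch: only pass the cross flags when actually
# cross-compiling, and derive them from the host platform rather
# than hardcoding CMAKE_SYSTEM_{NAME,PROCESSOR}.
cmakeFlags = lib.optionals (stdenv.hostPlatform != stdenv.buildPlatform) [
  (lib.cmakeFeature "CMAKE_SYSTEM_NAME" stdenv.hostPlatform.uname.system)
  (lib.cmakeFeature "CMAKE_SYSTEM_PROCESSOR" stdenv.hostPlatform.uname.processor)
];
```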

@prusnak
Member

prusnak commented Feb 20, 2025

cuda and rocm builds went fine - merging

@prusnak prusnak merged commit db27914 into NixOS:master Feb 20, 2025
24 of 27 checks passed
@abysssol
Contributor

I've tested ollama-rocm, and it does appear to properly utilize my GPU now.

Thank you @Pleune and @prusnak!
