sci-ml/ollama: fix broken CUDA support via dynamic GPU detection#409

Open
nabbi wants to merge 1 commit into gentoo:master from nabbi:ollama-cuda-gpu

Conversation

@nabbi

@nabbi nabbi commented Dec 29, 2025

Fix an issue where USE=cuda builds failed to provide GPU acceleration. Previous "native" build attempts were non-functional due to sandbox restrictions and incorrect architecture targeting.

  • Implement smart CUDAARCHS detection using __nvcc_device_query.
  • Add sandbox-aware hardware check to prevent 0x64 (No Device) errors.
  • Disable GGML_NATIVE to ensure specific GPU kernels are generated.
  • Default to 'all' (fat binary) if hardware is inaccessible.
  • Add pkg_pretend guidance for binary package (binpkg) portability.
  • Remove the duplicate backend install that caused CUDA tensor upload failures during model load (ollama/ollama#13614).
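A rough sketch of the detection logic the first bullets describe (the helper name `detect_cudaarchs` is illustrative, not the ebuild's actual code; `__nvcc_device_query` is the nvidia-cuda-toolkit helper that prints the compute capability of the visible GPU):

```shell
#!/bin/sh
# Pick CMAKE_CUDA_ARCHITECTURES from the device query output, falling back
# to "all" (fat binary) when no GPU is visible, e.g. inside the sandbox.
detect_cudaarchs() {
    # $1: output of __nvcc_device_query, empty if the query failed
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        printf 'all\n'
    fi
}
```

Usage would be along the lines of `CUDAARCHS=$(detect_cudaarchs "$(__nvcc_device_query 2>/dev/null)")`, with the sandbox check deciding whether the query is attempted at all.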

Thanks :)

@nabbi
Author

nabbi commented Jan 5, 2026

@negril FYA

@nabbi
Author

nabbi commented Jan 5, 2026

We still need to address the concern of "backends end up in /usr/bin otherwise"

@negril
Contributor

negril commented Jan 5, 2026

Building without -DGGML_BACKEND_DIR="${EPREFIX}/usr/$(get_libdir)/${PN}/backends" yields (via app-portage/iwdevtools):

 * CMP: =sci-ml/ollama-9999 with sci-ml/ollama-9999/image
 *  FILES:+usr/bin/libggml-cpu-haswell.so
 *  FILES:+usr/bin/libggml-cpu-sandybridge.so
 *  FILES:+usr/bin/libggml-cpu-sse42.so
 *  FILES:+usr/bin/libggml-cpu-x64.so
 *  FILES:+usr/bin/libggml-cuda.so
 *  FILES:+usr/bin/libggml-vulkan.so
 *  FILES:-usr/lib64/ollama/backends/libggml-cpu-haswell.so
 *  FILES:-usr/lib64/ollama/backends/libggml-cpu-sandybridge.so
 *  FILES:-usr/lib64/ollama/backends/libggml-cpu-sse42.so
 *  FILES:-usr/lib64/ollama/backends/libggml-cpu-x64.so
 *  FILES:-usr/lib64/ollama/backends/libggml-cuda.so
 *  FILES:-usr/lib64/ollama/backends/libggml-vulkan.so
 * ------> FILES(+6,-6)

This is an issue upstream needs to fix (and is most likely caused by the incomplete import and abuse of ggml by ollama).

You can try setting -DGGML_BACKEND_DIR="${EPREFIX}/usr/$(get_libdir)/${PN}" and see if that fixes your issue. Otherwise we need to work out how to correct or remove the dupes.
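For reference, the suggested change would look roughly like this in the ebuild's src_configure (a sketch only; the rest of mycmakeargs is omitted, and EPREFIX, get_libdir, and PN are standard Portage ebuild facilities):

```shell
# Sketch: point GGML's backend dir at the package libdir itself rather than
# a backends/ subdirectory, so the duplicated install collapses onto one path.
src_configure() {
    local mycmakeargs=(
        -DGGML_BACKEND_DIR="${EPREFIX}/usr/$(get_libdir)/${PN}"
    )
    cmake_src_configure
}
```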


addpredict "/dev/char/" is needed once we remove the SANDBOX_PREDICT entries from nvidia-cuda-toolkit.


Passing -DGGML_NATIVE=OFF has no effect for cuda once we pass CMAKE_CUDA_ARCHITECTURES, see https://github.com/ollama/ollama/blob/v0.13.5/ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt#L25
But it will change behaviour for the cpu backend. So I'd rather not pass that.


The cuda changes are an amalgamation of my cuda stuff in various ebuilds. I'll see how much it differs from the wip eclass and if it makes sense to add that to ::guru for the time being.

@nabbi
Author

nabbi commented Jan 5, 2026

Okay, I'll tweak with your feedback and retest. Yes, I think that'll resolve the duplication.

Fix an issue where USE=cuda builds failed to provide GPU acceleration.
Previous "native" build attempts were non-functional due to sandbox
restrictions and incorrect architecture targeting.

- Implement smart CUDAARCHS detection using __nvcc_device_query.
- Add sandbox-aware hardware check to prevent 0x64 (No Device) errors.
- Disable GGML_NATIVE to ensure specific GPU kernels are generated.
- Default to 'all' (fat binary) if hardware is inaccessible.
- Add pkg_pretend guidance for binary package (binpkg) portability.
- Fix duplicate library install.

Signed-off-by: Nic Boet <nic@boet.cc>
@nabbi
Author

nabbi commented Jan 5, 2026

Update. lmk if I missed the mark.

Yes. I found various approaches to handling CMAKE_CUDA_ARCHITECTURES in the tree. It would be awesome to have some standardization for this in cuda.eclass, plus documentation of a global CUDAARCHS :)
So I made a best guess to keep this aligned, and also added a bit more logging to confirm it was no longer building as "native". Changes welcome.

@nabbi
Author

nabbi commented Jan 5, 2026

/var/tmp/portage/sci-ml/ollama-9999# tree image/
image/
├── etc
│   ├── conf.d
│   │   └── ollama
│   └── init.d
│       └── ollama
└── usr
    ├── bin
    │   └── ollama
    ├── lib
    │   └── systemd
    │       └── system
    │           └── ollama.service
    ├── lib64
    │   └── ollama
    │       ├── libggml-base.so -> libggml-base.so.0
    │       ├── libggml-base.so.0 -> libggml-base.so.0.0.0
    │       ├── libggml-base.so.0.0.0
    │       ├── libggml-cpu-x64.so
    │       └── libggml-cuda.so
    └── share
        └── doc
            └── ollama-9999
                └── README.md.bz2

14 directories, 10 files

@negril
Contributor

negril commented Jan 5, 2026

But does it work? I'll look at the cuda stuff tomorrow.

@nabbi
Author

nabbi commented Jan 5, 2026

But does it work?

Yes! It's a little quirky: it attempts to reinstall on top of libggml-cpu-x64.so and libggml-cuda.so. The copy operation happens twice; the second time reports "Up-to-date":

-- Installing: /var/tmp/portage/sci-ml/ollama-9999/image/usr/lib64/ollama/libggml-cpu-x64.so
-- Set non-toolchain portion of runtime path of "/var/tmp/portage/sci-ml/ollama-9999/image/usr/lib64/ollama/libggml-cpu-x64.so" to ""
-- Up-to-date: /var/tmp/portage/sci-ml/ollama-9999/image/usr/lib64/ollama/libggml-cpu-x64.so

-- Installing: /var/tmp/portage/sci-ml/ollama-9999/image/usr/lib64/ollama/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/var/tmp/portage/sci-ml/ollama-9999/image/usr/lib64/ollama/libggml-cuda.so" to ""
-- Up-to-date: /var/tmp/portage/sci-ml/ollama-9999/image/usr/lib64/ollama/libggml-cuda.so
