ramalama not working on older x86 hardware #1145

@kraxel

Description

kraxel@ollivander ~# ramalama --debug run tiny hello
run_cmd:  podman inspect quay.io/ramalama/cuda:0.6
Working directory: None
Ignore stderr: False
Ignore all: True
exec_cmd:  podman run --rm -i --label ai.ramalama --name ramalama_v2MjJibw8H --env=HOME=/tmp --init --runtime /usr/bin/nvidia-container-runtime --security-opt=label=disable --cap-drop=all --security-opt=no-new-privileges --label ai.ramalama.model=ollama://tinyllama --label ai.ramalama.engine=podman --label ai.ramalama.runtime=llama.cpp --label ai.ramalama.command=run --env LLAMA_PROMPT_PREFIX=🦭 >  --pull=newer -t --device /dev/dri --device nvidia.com/gpu=all -e CUDA_VISIBLE_DEVICES=0 --network none --mount=type=bind,src=/home/kraxel/.local/share/ramalama/models/ollama/tinyllama:latest,destination=/mnt/models/model.file,ro quay.io/ramalama/cuda:latest llama-run -c 2048 --temp 0.8 -v --ngl 999 /mnt/models/model.file hello
Loading model
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce GTX 1060 6GB, compute capability 6.1, VMM: yes
kraxel@ollivander ~# dmesg | tail -1
[  401.460183] traps: llama-run[1907] trap invalid opcode ip:7f2d23f352ac sp:7ffcf9dbfd20 error:0 in libggml-cpu.so[3a2ac,7f2d23f03000+60000]
kraxel@ollivander ~# lscpu | grep avx
kraxel@ollivander ~# 

I suspect libggml-cpu.so uses AVX instructions without first checking whether the CPU actually supports them. The trap fires in libggml-cpu.so while the model loads, and `lscpu | grep avx` on this machine returns nothing, so the GTX 1060's GPU offload never gets a chance: the CPU-side code faults with an invalid opcode first.
