
fix(docker): resolve llama.cpp header overwrite causing SIGSEGV on startup #24

Merged
orneryd merged 1 commit into orneryd:main from bellorr:fix/docker-llama-header-overwrite on Feb 28, 2026

bellorr commented Feb 28, 2026

Environment

  • Machine: Apple M1 Pro (arm64)
  • OS: macOS 26.3
  • Docker: 29.2.1
  • Deployment: timothyswt/nornicdb-arm64-metal-bge-heimdall:1.0.12 via docker-compose, configured with NORNICDB_HEIMDALL_ENABLED=true, NORNICDB_HEIMDALL_PROVIDER=local, NORNICDB_HEIMDALL_GPU_LAYERS=0, NORNICDB_EMBEDDING_GPU_LAYERS=0

Problem

Image 1.0.12 crashes on startup — the container never passes its healthcheck and exits immediately. Image 1.0.11 works fine on the same machine with the same config.

Crash output (1.0.12):

🛡️ Loading Heimdall model: /app/models/qwen3-0.6b-instruct.gguf
   GPU layers: -1 (-1 = auto, falls back to CPU if needed)
llama_init_from_model: failed to initialize the context:
  the backend samplers must be of type llama_sampler_chain
SIGSEGV: segmentation violation
PC=0x0 m=3 sigcode=1 addr=0x0
signal arrived during cgo execution

goroutine 31 gp=0x330d050241e0 m=3 mp=0x330cf9f91008 [syscall]:
runtime.cgocall(0x188e360, 0x330cfa2f13b8)
github.com/orneryd/nornicdb/pkg/localllm._Cfunc_create_gen_context(...)
github.com/orneryd/nornicdb/pkg/localllm.LoadGenerationModel(...)
    /build/pkg/localllm/llama.go:999
github.com/orneryd/nornicdb/pkg/heimdall.cgoGeneratorLoader(...)
    /build/pkg/heimdall/generator_cgo.go:25

The reranker hits the same crash independently:

github.com/orneryd/nornicdb/pkg/localllm.LoadRerankerModel(...)
    /build/pkg/localllm/llama.go:899
github.com/orneryd/nornicdb/pkg/server.New.func2()
    /build/pkg/server/server.go:1110

Root Cause

All affected Dockerfiles use LLAMA_VERSION=b8157. The build stage correctly clones b8157 and copies its headers into /build/lib/llama/ — but the next instruction immediately overwrites them:

# Stage 3 builder (current broken order)
COPY --from=llama /out/libllama_combined.a /build/lib/llama/libllama_linux_arm64.a
# b8157 headers land here...
COPY --from=llama /out/*.h /build/lib/llama/

# ...then this overwrites them with the repo's stale b7285 headers
COPY . .

This produces a struct layout mismatch that neither the compiler nor the linker can detect: the static library is compiled from b8157, but the CGo bindings compile against the repo's lib/llama/llama.h at b7285. llama.cpp PR #17004 (merged Jan 4 2026, first included in b8157) added samplers/n_samplers fields to llama_context_params. Built against the shorter b7285 definition, create_gen_context passes an undersized struct by value to llama_init_from_model. The library reads garbage for the new samplers field, fails the sampler-type validation check, prints the error, and the process then segfaults. This is why 1.0.11 (built against an older llama.cpp that predates PR #17004) was unaffected.

Fix

Split the llama COPY into two steps. The static library stays before COPY . . to preserve Docker layer cache efficiency, and the header copy moves after COPY . . so that the headers matching the built library always take precedence over whatever is committed to the repo:

# Fixed order
COPY --from=llama /out/libllama_combined.a /build/lib/llama/libllama_linux_arm64.a
COPY go.mod go.sum ./
RUN go mod download
COPY . .
COPY --from=ui /ui/dist ./ui/dist
# Headers AFTER COPY . . — b8157 headers override stale repo headers
COPY --from=llama /out/*.h /build/lib/llama/

This PR also adds header output to the vulkan-heimdall llama-builder stage, which never copied headers at all and silently relied on the stale repo headers.

Affected Dockerfiles: arm64-metal, arm64-metal-heimdall, amd64-cuda, amd64-cuda-heimdall, amd64-vulkan-heimdall, cpu-bge

Verification

Built from this branch and ran the fixed image on the same M1 Pro machine:

🛡️ Loading Heimdall model: /app/models/qwen3-0.6b-instruct.gguf
✅ SLM model loaded: qwen3-0.6b-instruct
✅ Search reranker ready: bge-reranker-v2-m3.gguf (Stage-2 reranking enabled)
✅ Heimdall AI Assistant ready → 6 actions available

$ curl http://localhost:7474/health
{"status":"healthy"}

No SIGSEGV. Full docker-compose stack (nornicdb + mimir) comes up healthy.
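The manual curl check above can also be scripted. The sketch below polls the same /health endpoint and accepts only the {"status":"healthy"} body shown in the output. The retry count, delay, and client timeout are arbitrary assumptions and would likely need tuning, since model loading can take a while.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// healthResponse mirrors the JSON body shown in the verification output.
type healthResponse struct {
	Status string `json:"status"`
}

// isHealthy reports whether body is valid JSON with status "healthy".
func isHealthy(body []byte) bool {
	var h healthResponse
	if err := json.Unmarshal(body, &h); err != nil {
		return false
	}
	return h.Status == "healthy"
}

// waitHealthy polls url until it returns HTTP 200 with a healthy body,
// or until the given number of attempts is exhausted.
func waitHealthy(url string, attempts int, delay time.Duration) error {
	client := &http.Client{Timeout: 2 * time.Second}
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			body, readErr := io.ReadAll(resp.Body)
			resp.Body.Close()
			if readErr == nil && resp.StatusCode == http.StatusOK && isHealthy(body) {
				return nil
			}
		}
		time.Sleep(delay)
	}
	return fmt.Errorf("%s not healthy after %d attempts", url, attempts)
}

func main() {
	// Attempt count and delay are illustrative values, not project defaults.
	if err := waitHealthy("http://localhost:7474/health", 5, time.Second); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("healthy")
}
```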

🤖 Generated with Claude Code

COPY . . overwrites the fresh llama.cpp b8157 headers (copied from the
build stage) with the stale b7285 headers committed to the repo. The
resulting struct size mismatch causes a SIGSEGV in create_gen_context
when llama_init_from_model reads garbage for the new samplers field
added in b8157.

Fix: move the header COPY to after COPY . . in all affected Dockerfiles
so the correct versioned headers always take precedence.

Fixes crash: "the backend samplers must be of type llama_sampler_chain"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

orneryd commented Feb 28, 2026

Thanks for that! I somehow didn't see this on Mac when I was testing, so I appreciate the fix!

orneryd merged commit b48bd09 into orneryd:main on Feb 28, 2026