fix(studio): build llama.cpp from master for Gemma 4 support #4790

Merged
danielhanchen merged 1 commit into main from fix/llama-cpp-master-tag on Apr 2, 2026
Conversation

@danielhanchen (Contributor)

Summary

  • Change the default llama.cpp build target from the latest release tag to the master branch
  • The latest ggml-org/llama.cpp release (b8635) does not include Gemma 4 support; the upstream PR "model: support gemma 4 (vision + moe, no audio)" (ggml-org/llama.cpp#21309) merged after the release was cut
  • Without it, llama-server fails with "unknown model architecture: 'gemma4'" when loading any Gemma 4 GGUF
  • This is a temporary fix until the next upstream release includes Gemma 4 support

Changes

  • setup.sh: add _DEFAULT_LLAMA_TAG="master" maintainer-editable default
  • setup.ps1: add $DefaultLlamaTag="master" maintainer-editable default
  • install_llama_prebuilt.py: change DEFAULT_LLAMA_TAG fallback from "latest" to "master"

Users can still override via UNSLOTH_LLAMA_TAG env var. Once a new upstream release is cut with Gemma 4 support, revert these back to "latest".
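The default-plus-override semantics can be sketched in POSIX shell; the `${VAR:-default}` expansion below is an assumption about how setup.sh could consume `_DEFAULT_LLAMA_TAG`, not the script's actual contents:

```shell
# Maintainer-editable default, as added by this PR (changed from the latest
# release tag to the master branch).
_DEFAULT_LLAMA_TAG="master"

# Users override via the UNSLOTH_LLAMA_TAG environment variable, e.g.:
#   UNSLOTH_LLAMA_TAG=b8635 ./install.sh --local
LLAMA_TAG="${UNSLOTH_LLAMA_TAG:-$_DEFAULT_LLAMA_TAG}"

# Prints "master" unless UNSLOTH_LLAMA_TAG is set in the environment.
echo "$LLAMA_TAG"
```

Once upstream cuts a release containing Gemma 4, only the single maintainer default needs to change back to "latest".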

Test plan

  • Fresh install via install.sh --local builds llama.cpp from master HEAD
  • Load a Gemma 4 E2B GGUF in Studio and verify it loads (previously failed with b8635)
  • Load a Qwen3.5-4B GGUF and verify non-Gemma models still work
  • Verify the UNSLOTH_LLAMA_TAG=b8635 override still works (builds from that tag)

The latest ggml-org/llama.cpp release (b8635) does not include Gemma 4
support (ggml-org/llama.cpp#21309 merged after the release was cut).
This causes `llama-server` to fail with "unknown model architecture:
gemma4" when loading Gemma 4 GGUFs.

Temporarily default _DEFAULT_LLAMA_TAG to "master" so all new installs
build from the llama.cpp master branch which includes Gemma 4 support.
Once a new upstream release is cut with Gemma 4, this can be reverted
back to "latest".

Changes:
- setup.sh: add _DEFAULT_LLAMA_TAG="master" maintainer default
- setup.ps1: add $DefaultLlamaTag="master" maintainer default
- install_llama_prebuilt.py: change DEFAULT_LLAMA_TAG fallback to "master"

Users can still override via UNSLOTH_LLAMA_TAG env var.
@danielhanchen merged commit 1ce83c4 into main on Apr 2, 2026
5 checks passed
@danielhanchen deleted the fix/llama-cpp-master-tag branch on April 2, 2026 at 16:45

@gemini-code-assist (bot) left a comment


Code Review

This pull request updates the default llama.cpp tag from "latest" to "master" across the Python, PowerShell, and Shell setup scripts. Feedback highlights that switching to "master" while using the ggml-org repository forces source builds for all users, leading to significant performance regressions and reduced installation stability. Additionally, there is a suggestion to improve environment variable handling in the Python script to robustly handle empty strings and a concern regarding the maintenance overhead of duplicating the default tag across multiple files.

# ──────────────────────────────────────────────────────────────────────────
_DEFAULT_LLAMA_PR_FORCE=""
_DEFAULT_LLAMA_SOURCE="https://github.com/ggml-org/llama.cpp"
_DEFAULT_LLAMA_TAG="master"

Severity: high

Setting the default tag to "master" while using ggml-org/llama.cpp as the helper repository effectively forces a source build for all users, as ggml-org does not host the Unsloth-specific manifest and checksum files required for prebuilt resolution. This results in a significant performance regression (5-10 minutes per setup run) because the scripts lack logic to skip the build if the existing installation is already at the correct commit. Furthermore, "master" is a moving target, making installations less reproducible and potentially unstable.
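The missing guard the reviewer describes could look roughly like the sketch below; `should_build` and the stubbed commit strings are hypothetical, since a real implementation would have to resolve the installed commit (e.g. from the built binary's version output) and the remote master HEAD:

```shell
# Hypothetical guard: skip the 5-10 minute source build when the existing
# install already matches the commit resolved for the requested tag/branch.
should_build() {
    installed="$1"   # commit of the current install ("" if none installed)
    target="$2"      # commit resolved for the requested tag or branch
    # Skip (return non-zero) only when both are known and identical.
    [ -n "$installed" ] && [ "$installed" = "$target" ] && return 1
    return 0
}

# Demo with stubbed commit values: same commit on both sides, so we skip.
if should_build "1ce83c4" "1ce83c4"; then
    echo "rebuild"
else
    echo "skip"
fi
```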

# --------------------------------------------------------------------------
$DefaultLlamaPrForce = ""
$DefaultLlamaSource = "https://github.com/ggml-org/llama.cpp"
$DefaultLlamaTag = "master"

high

Setting the default tag to "master" while using ggml-org/llama.cpp as the helper repository effectively forces a source build for all users, as ggml-org does not host the Unsloth-specific dummy manifest and checksum files required for prebuilt resolution. This results in a significant performance regression (5-10 minutes per setup run) because the scripts lack logic to skip the build if the existing installation is already at the correct commit. Furthermore, "master" is a moving target, making installations less reproducible and potentially unstable.



- DEFAULT_LLAMA_TAG = os.environ.get("UNSLOTH_LLAMA_TAG", "latest")
+ DEFAULT_LLAMA_TAG = os.environ.get("UNSLOTH_LLAMA_TAG", "master")

Severity: medium

The default tag is now duplicated across three files (install_llama_prebuilt.py, setup.sh, and setup.ps1), which increases maintenance overhead. Additionally, using os.environ.get(..., "master") will return an empty string if the environment variable is set but empty, which might lead to unexpected behavior in normalized_requested_llama_tag (which defaults to 'latest' for empty strings). Using or ensures that empty strings also fall back to the intended default value.

Suggested change
DEFAULT_LLAMA_TAG = os.environ.get("UNSLOTH_LLAMA_TAG", "master")
DEFAULT_LLAMA_TAG = os.environ.get("UNSLOTH_LLAMA_TAG") or "master"
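The empty-string distinction behind this suggestion can be demonstrated directly:

```python
import os

# Simulate UNSLOTH_LLAMA_TAG being set but empty (e.g. by a wrapper script
# that exports it unconditionally).
os.environ["UNSLOTH_LLAMA_TAG"] = ""

# os.environ.get falls back only when the key is *missing*, so a present but
# empty variable is returned as-is.
with_default = os.environ.get("UNSLOTH_LLAMA_TAG", "master")

# `or` also treats the empty string as falsy, so it falls through to "master".
with_or = os.environ.get("UNSLOTH_LLAMA_TAG") or "master"

print(repr(with_default))  # ''
print(repr(with_or))       # 'master'
```

With the `or` form, an empty override behaves the same as an unset one, which matches the intended "master" default.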
