feat: support per-model overrides in llama.cpp load() #5820
Conversation
Extend the `load()` method in the llama.cpp extension to accept an optional `overrideSettings` argument, allowing fine-grained per-model configuration. This lets users override provider-level settings such as `ctx_size`, `chat_template`, and `n_gpu_layers` when loading a specific model.

Fixes: #5818 (Feature Request - Jan v0.6.6)

Use cases enabled:

- Different context sizes per model (e.g., 4K vs 32K)
- Model-specific chat templates (ChatML, Alpaca, etc.)
- Performance tuning (threads, GPU layers)
- Better memory management per deployment

Maintains full backward compatibility with the existing provider config.
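For concreteness, here is a minimal TypeScript sketch of what the extended signature and merge might look like. This is illustrative, not the actual Jan source: `LlamacppConfig`, `providerConfig`, `loadedModels`, and `spawnLlamaServer` are hypothetical names standing in for the extension's internals.

```typescript
// Hypothetical stand-ins for the extension's internal config and state.
interface LlamacppConfig {
  ctx_size?: number
  chat_template?: string
  n_gpu_layers?: number
  n_threads?: number
  [key: string]: unknown
}

declare const providerConfig: LlamacppConfig // provider-level settings
declare const loadedModels: Set<string>      // models already running
declare function spawnLlamaServer(
  modelId: string,
  config: LlamacppConfig
): Promise<void>

async function load(
  modelId: string,
  overrideSettings?: Partial<LlamacppConfig>
): Promise<void> {
  if (loadedModels.has(modelId)) {
    throw new Error('Model already loaded!')
  }
  // Shallow merge: per-model overrides win over provider-level defaults.
  const config: LlamacppConfig = { ...providerConfig, ...overrideSettings }
  await spawnLlamaServer(modelId, config)
  loadedModels.add(modelId)
}

// One model gets a 32K context and full GPU offload...
await load('qwen2.5-7b-instruct', { ctx_size: 32768, n_gpu_layers: 99 })
// ...while another keeps the provider defaults.
await load('tinyllama-1.1b')
```

Because `overrideSettings` is optional and merged last, existing call sites that pass only a model ID behave exactly as before.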
Important
Looks good to me! 👍
Reviewed everything up to 41d2d8d in 49 seconds.

- Reviewed 23 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments (shown below)
1. extensions/llamacpp-extension/src/index.ts:762
- Draft comment: Add a JSDoc comment for `overrideSettings` to explain its purpose and expected properties.
- Reason this comment was not posted: confidence changes required: 0% <= threshold 50%
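One way such a JSDoc block might read (hypothetical wording; the listed keys follow the PR description):

```typescript
/**
 * Load a model with the llama.cpp backend.
 *
 * @param modelId - ID of the model to load.
 * @param overrideSettings - Optional per-model overrides, shallow-merged
 *   over the provider-level config. Keys mirror the provider settings,
 *   e.g. `ctx_size`, `chat_template`, `n_gpu_layers`, `n_threads`.
 */
```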
2. extensions/llamacpp-extension/src/index.ts:777
- Draft comment: Shallow merge of config and overrideSettings works now; ensure this is sufficient if nested objects are added in the future.
- Reason this comment was not posted: confidence changes required: 0% <= threshold 50%
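To see why the reviewer flags nesting: a shallow spread replaces nested objects wholesale rather than merging them. The `sampling` key below is hypothetical (the current llama.cpp settings are flat); a recursive merge would only become necessary if nested settings were introduced later.

```typescript
const provider = { ctx_size: 4096, sampling: { temp: 0.7, top_p: 0.9 } }
const override = { sampling: { temp: 0.2 } }

// Shallow merge: the whole `sampling` object is replaced, so `top_p` is lost.
const shallow = { ...provider, ...override }
// shallow.sampling => { temp: 0.2 }

// A recursive merge would preserve sibling keys inside nested objects:
function deepMerge(
  base: Record<string, unknown>,
  patch: Record<string, unknown>
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...base }
  for (const [key, value] of Object.entries(patch)) {
    const prev = out[key]
    out[key] =
      value !== null && typeof value === 'object' && !Array.isArray(value) &&
      prev !== null && typeof prev === 'object' && !Array.isArray(prev)
        ? deepMerge(
            prev as Record<string, unknown>,
            value as Record<string, unknown>
          )
        : value
  }
  return out
}

const deep = deepMerge(provider, override)
// deep.sampling => { temp: 0.2, top_p: 0.9 }
```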
louis-jan left a comment
LGTM
Barecheck - Code coverage report: Total 35%. Coverage diff: 0.00% ▴ ✅ All code changes are covered.
Important
Looks good to me! 👍
Reviewed a0d5c24 in 1 minute and 39 seconds.

- Reviewed 14 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments (shown below)
1. extensions/llamacpp-extension/src/index.ts:760
- Draft comment: Changing the order of parameters in `load()` may break existing calls that use positional arguments. For example, a call like `load('model', true)` now treats `true` as `overrideSettings`. Consider using an options object or an overload to preserve backward compatibility.
- Reason this comment was not posted: decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 30% vs. threshold = 50%). The concern about API compatibility is valid: reordering could silently break positional callers, with boolean values misinterpreted as settings objects. However, the reordering may be an intentional breaking change, there is no evidence of actual callers using positional arguments, and the author may have already planned for the impact. Absent strong evidence of a real problem, the reviewer erred on the side of trusting the author.
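A sketch of the hazard and of the options-object alternative the reviewer suggests. The `isEmbedding` flag is hypothetical; the PR does not name the parameter that was reordered.

```typescript
// If the previous signature had a trailing boolean, e.g.
//   load(modelId: string, isEmbedding?: boolean)
// then inserting overrideSettings in front of it means an old positional call
//   load('model', true)
// now passes `true` where a settings object is expected.

// An options object removes the ordering hazard and stays extensible:
interface LoadOptions {
  overrideSettings?: Record<string, unknown> // per-model config overrides
  isEmbedding?: boolean                      // hypothetical original flag
}

async function load(modelId: string, opts: LoadOptions = {}): Promise<void> {
  const { overrideSettings = {}, isEmbedding = false } = opts
  // ...merge overrideSettings over the provider config and start the model...
}

// Call sites then name their intent explicitly:
await load('model', { isEmbedding: true })
await load('model', { overrideSettings: { ctx_size: 32768 } })
```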
2. extensions/llamacpp-extension/src/index.ts:767
- Draft comment: Typographical note: the error message at line 767 uses double exclamation marks ('Model already loaded!!'). Consider a single exclamation mark for a more conventional tone.
- Reason this comment was not posted: the comment was not on a location in the diff, so it can't be submitted as a review comment.