Skip to content

Conversation

@qnixsynapse
Copy link
Contributor

@qnixsynapse qnixsynapse commented Jul 20, 2025

Extend the load() method in the llama.cpp extension to accept optional overrideSettings, allowing fine-grained per-model configuration.

This enables users to override provider-level settings such as ctx_size, chat_template, n_gpu_layers, etc., when loading a specific model.

Fixes: #5818 (Feature Request - Jan v0.6.6)

Use cases enabled:

  • Different context sizes per model (e.g., 4K vs 32K)
  • Model-specific chat templates (ChatML, Alpaca, etc.)
  • Performance tuning (threads, GPU layers)
  • Better memory management per deployment

Maintains full backward compatibility with existing provider config.


Important

Extend load() in index.ts to support per-model configuration overrides, enabling settings like ctx_size and chat_template for individual models.

  • Behavior:
    • Extend load() in index.ts to accept overrideSettings for per-model configuration.
    • Allows overriding ctx_size, chat_template, n_gpu_layers, etc., for specific models.
    • Maintains backward compatibility with existing provider config.
  • Use Cases:
    • Supports different context sizes per model (e.g., 4K vs 32K).
    • Enables model-specific chat templates (ChatML, Alpaca, etc.).
    • Allows performance tuning (threads, GPU layers).
    • Improves memory management per deployment.

This description was created by Ellipsis for a0d5c24. You can customize this summary. It will automatically update as commits are pushed.

Extend the `load()` method in the llama.cpp extension to accept optional
`overrideSettings`, allowing fine-grained per-model configuration.

This enables users to override provider-level settings such as `ctx_size`,
`chat_template`, `n_gpu_layers`, etc., when loading a specific model.

Fixes: #5818 (Feature Request - Jan v0.6.6)

Use cases enabled:
- Different context sizes per model (e.g., 4K vs 32K)
- Model-specific chat templates (ChatML, Alpaca, etc.)
- Performance tuning (threads, GPU layers)
- Better memory management per deployment

Maintains full backward compatibility with existing provider config.
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed everything up to 41d2d8d in 49 seconds. Click for details.
  • Reviewed 23 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. extensions/llamacpp-extension/src/index.ts:762
  • Draft comment:
    Add a JSDoc comment for 'overrideSettings' to explain its purpose and expected properties.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. extensions/llamacpp-extension/src/index.ts:777
  • Draft comment:
    Shallow merge of config and overrideSettings works now; ensure this is sufficient if nested objects are added in the future.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_V5h26q9OTWgClQoQ

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Copy link
Contributor

@louis-jan louis-jan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions
Copy link
Contributor

github-actions bot commented Jul 20, 2025

Barecheck - Code coverage report

Total: 35%

Your code coverage diff: 0.00% ▴

✅ All code changes are covered

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed a0d5c24 in 1 minute and 39 seconds. Click for details.
  • Reviewed 14 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. extensions/llamacpp-extension/src/index.ts:760
  • Draft comment:
    Changing the order of parameters in load() may break existing calls using positional arguments. For example, a call like load('model', true) now treats true as overrideSettings. Consider using an options object or overloading to ensure backward compatibility.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% This is a valid concern about API compatibility. The parameter reordering could silently break existing code that uses positional arguments, causing runtime errors where boolean values get misinterpreted as settings objects. However, I need to consider if this is something that needs to be flagged in a PR comment. The comment suggests using an options object or overloading, but doesn't consider that this might be an intentional breaking change. Also, we don't know if there are any actual callers using positional arguments. While the technical concern is valid, breaking changes are sometimes necessary and intentional. The author may have already considered the impact and planned for it. The comment raises a valid concern about a breaking API change, but since we don't have strong evidence that this will cause actual problems, and breaking changes may be intentional, we should err on the side of trusting the author.
2. extensions/llamacpp-extension/src/index.ts:767
  • Draft comment:
    Typographical note: The error message at line 767 uses double exclamation marks ('Model already loaded!!'). Consider using a single exclamation mark for a more conventional tone.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_rIrPaeFPfzRsO5aa

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@qnixsynapse qnixsynapse moved this to Needs Review in Jan Jul 20, 2025
@qnixsynapse qnixsynapse merged commit 81d6ed3 into release/v0.6.6 Jul 21, 2025
47 of 54 checks passed
@qnixsynapse qnixsynapse deleted the feat/per-model-configuration branch July 21, 2025 03:29
@github-project-automation github-project-automation bot moved this from Needs Review to QA in Jan Jul 21, 2025
@github-actions github-actions bot added this to the v0.6.6 milestone Jul 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants