epic: Introduce smart model management #6000

## Problem Statement
Jan currently defaults to a context length of 8192 tokens. Because `llama.cpp` preallocates its KV cache for the full context length, this fixed default often consumes a significant portion of the system's available memory. The operating system can then be left with too little memory, which leads to excessive swapping, system unresponsiveness, and ultimately hangs. The impact is most severe on devices with limited RAM or when running larger models.
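To make the scale concrete, here is a rough estimate of the KV cache size, assuming a standard transformer layout with 16-bit cache entries. The function name and the example model shape (32 layers, 8 KV heads, head dimension 128, similar to a Llama-3-8B-style model) are illustrative assumptions, not values from Jan or `llama.cpp`, and the estimate ignores compute buffers and model weights.

```python
def kv_cache_bytes(n_ctx: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_element: int = 2) -> int:
    """Approximate KV cache size in bytes for a given context length.

    The factor of 2 accounts for storing both keys and values.
    bytes_per_element=2 assumes an f16 cache (an assumption of this sketch).
    """
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_element


# Hypothetical model shape used only for illustration.
size = kv_cache_bytes(n_ctx=8192, n_layers=32, n_kv_heads=8, head_dim=128)
print(f"{size / 2**30:.2f} GiB")  # prints "1.00 GiB"
```

The cache grows linearly with the context length, so a model without grouped-query attention (more KV heads) or with more layers needs proportionally more memory at the same 8192-token default.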

---

## Feature Idea
The goal is to introduce **smart model management** that dynamically adjusts the context length based on the backend device's available memory. This will ensure that even after `llama.cpp` preallocates its KV cache, there's enough memory remaining for the OS, preventing system hangs and improving overall stability.

The proposed solution involves:
1.  **Memory Detection:** Accurately determine the total and available memory of the backend device.
2.  **Dynamic Context Length Calculation:** Based on the detected memory, calculate the largest context length that fits within the memory usage threshold the user has selected.
3.  **Configurable Memory Usage Settings:** Provide users with predefined settings to control memory allocation:
    * **Balanced:** Utilizes approximately 60% of the available memory for the model's context.
    * **High:** Pushes memory usage up to approximately 90% of the available memory, suitable for users prioritizing maximum context length.
    * **Low:** Restricts memory usage to less than 50% of the available memory, ideal for systems with limited resources or users running other demanding applications.
4.  **Model Loading Error Handling:** If a model, even with the "Low" setting, is too large to fit within the calculated memory constraints, the system should display a clear "Model cannot be loaded" error message, preventing potential crashes.
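The calculation and error handling described in the steps above could be sketched as follows. This is a minimal sketch under stated assumptions, not Jan's implementation: `ModelShape`, `pick_context_length`, `ModelCannotBeLoadedError`, the 512-token floor, and the rounding to multiples of 256 are all hypothetical. The available memory is passed in by the caller, so memory detection itself is out of scope here. The sketch also assumes the memory budget must cover the model weights as well as the KV cache, treats "Low" as a 50% cap, and raises the error for whichever preset is selected.

```python
from dataclasses import dataclass

# Percent of available memory per preset, taken from the list above.
# "Low" is described as "less than 50%", so 50 is used as an upper bound.
MEMORY_PERCENT = {"low": 50, "balanced": 60, "high": 90}

MIN_CONTEXT = 512  # hypothetical smallest context length considered usable
CONTEXT_STEP = 256  # hypothetical rounding granularity


class ModelCannotBeLoadedError(RuntimeError):
    """Raised when the model does not fit within the memory budget."""


@dataclass(frozen=True)
class ModelShape:
    n_layers: int
    n_kv_heads: int
    head_dim: int
    weights_bytes: int          # memory needed for the model weights
    max_context: int            # context length the model was trained for
    kv_bytes_per_element: int = 2  # assumes an f16 KV cache


def kv_bytes_per_token(model: ModelShape) -> int:
    # Keys and values (factor 2) for every layer and KV head.
    return (2 * model.n_layers * model.n_kv_heads
            * model.head_dim * model.kv_bytes_per_element)


def pick_context_length(model: ModelShape, available_bytes: int,
                        preset: str = "balanced") -> int:
    # Integer arithmetic avoids floating-point rounding surprises.
    budget = available_bytes * MEMORY_PERCENT[preset] // 100
    kv_budget = budget - model.weights_bytes
    if kv_budget <= 0:
        raise ModelCannotBeLoadedError("Model cannot be loaded")

    n_ctx = min(kv_budget // kv_bytes_per_token(model), model.max_context)
    n_ctx -= n_ctx % CONTEXT_STEP  # round down to a multiple of CONTEXT_STEP
    if n_ctx < MIN_CONTEXT:
        raise ModelCannotBeLoadedError("Model cannot be loaded")
    return n_ctx
```

For example, with a hypothetical model whose weights take 5 GiB and 10 GiB of available memory, the "Balanced" preset leaves 1 GiB for the KV cache, while "Low" leaves nothing and raises the error:

```python
GiB = 2**30
model = ModelShape(n_layers=32, n_kv_heads=8, head_dim=128,
                   weights_bytes=5 * GiB, max_context=131072)
print(pick_context_length(model, 10 * GiB, "balanced"))  # 8192
print(pick_context_length(model, 10 * GiB, "high"))      # 32768
```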

This feature will enhance stability, provide a smoother user experience, and allow for more efficient utilization of system resources.
