diff --git a/docs/docs/specs/engineering/engine.md b/docs/docs/specs/engineering/engine.md
new file mode 100644
index 0000000000..d25fdfc041
--- /dev/null
+++ b/docs/docs/specs/engineering/engine.md
@@ -0,0 +1,60 @@
+---
+title: Engine
+slug: /specs/engine
+---
+
+:::caution
+
+Currently Under Development
+
+:::
+
+## Overview
+
+In the Jan application, engines serve as primary entities with the following capabilities:
+
+- Engines will be installed through `inference-extensions`.
+- Models will depend on engines to do [inference](https://en.wikipedia.org/wiki/Inference_engine).
+- Engine configuration and required metadata will be stored in a JSON file.
+
+## Folder Structure
+
+- Default parameters for engines are stored in JSON files located in the `/engines` folder.
+- Each parameter file is named after its unique `engine_id` (for example, `nitro.json`).
+- Engines are referenced directly using `engine_id` in the `model.json` file.
+
+```yaml
+jan/
+  engines/
+    nitro.json
+    openai.json
+    .....
+```
+
+## Engine Default Parameter Files
+
+- Each inference engine requires default parameters to fall back on when user-provided parameters are absent.
+- These parameters are stored in JSON files, structured as simple key-value pairs.
+
+### Example
+
+Here is an example of an engine file for `engine_id` `nitro`:
+
+```js
+{
+  "ctx_len": 512,
+  "ngl": 100,
+  "embedding": false,
+  "n_parallel": 1,
+  "cont_batching": false,
+  "prompt_template": "<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant"
+}
+```
+
+For detailed engine parameters, refer to: [Nitro's Model Settings](https://nitro.jan.ai/features/load-unload#table-of-parameters)
+
+## Adding an Engine
+
+- Engine parameter files are automatically generated upon installing an `inference-extension` in the Jan application.
+
+---
diff --git a/docs/docs/specs/engineering/models.md b/docs/docs/specs/engineering/models.md
index decf8f5e98..9d97244e7b 100644
--- a/docs/docs/specs/engineering/models.md
+++ b/docs/docs/specs/engineering/models.md
@@ -53,9 +53,9 @@ jan/ # Jan root folder
 
 Here's a standard example `model.json` for a GGUF model.
 
-- `source_url`: https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/.
 
 ```js
+{
 "id": "zephyr-7b", // Defaults to foldername
 "object": "model", // Defaults to "model"
 "source_url": "https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/blob/main/zephyr-7b-beta.Q4_K_M.gguf",
@@ -64,15 +64,16 @@ Here's a standard example `model.json` for a GGUF model.
 "version": "1", // Defaults to 1
 "created": 1231231, // Defaults to file creation time
 "description": null, // Defaults to null
-"state": enum[null, "downloading", "ready", "starting", "stopping", ...]
+"state": enum[null, "ready"],
 "format": "ggufv3", // Defaults to "ggufv3"
-"settings": { // Models are initialized with settings
-  "ctx_len": 2048,
+"engine": "nitro", // engine_id specified in jan/engines folder
+"engine_parameters": { // Engine parameters inside model.json can override
+  "ctx_len": 2048, // the value inside the base engine.json
   "ngl": 100,
   "embedding": true,
   "n_parallel": 4,
 },
-"parameters": { // Models are called parameters
+"model_parameters": { // Models are called with parameters
   "stream": true,
   "max_tokens": 2048,
   "stop": [""], // This usually can be left blank, only used with specific need from model author
@@ -85,9 +86,10 @@ Here's a standard example `model.json` for a GGUF model.
"assets": [ // Defaults to current dir "file://.../zephyr-7b-q4_k_m.bin", ] +} ``` -The model settings in the example can be found at: [Nitro's model settings](https://nitro.jan.ai/features/load-unload#table-of-parameters) +The engine parameters in the example can be found at: [Nitro's model settings](https://nitro.jan.ai/features/load-unload#table-of-parameters) The model parameters in the example can be found at: [Nitro's model parameters](https://nitro.jan.ai/api-reference#tag/Chat-Completion) diff --git a/docs/sidebars.js b/docs/sidebars.js index edef458cd7..384f47e9dd 100644 --- a/docs/sidebars.js +++ b/docs/sidebars.js @@ -81,6 +81,7 @@ const sidebars = { items: [ "specs/engineering/chats", "specs/engineering/models", + "specs/engineering/engine", "specs/engineering/threads", "specs/engineering/messages", "specs/engineering/assistants",