Commit 4e779e6

Add backend to CI, update AGENTS.md from this exercise

Signed-off-by: Ettore Di Giacinto <[email protected]>

1 parent 7137c75 · commit 4e779e6

4 files changed: +245 −2 lines changed

.github/workflows/backend.yml

Lines changed: 39 additions & 0 deletions
```diff
@@ -78,6 +78,19 @@ jobs:
           dockerfile: "./backend/Dockerfile.python"
           context: "./"
           ubuntu-version: '2404'
+        - build-type: ''
+          cuda-major-version: ""
+          cuda-minor-version: ""
+          platforms: 'linux/amd64'
+          tag-latest: 'auto'
+          tag-suffix: '-cpu-moonshine'
+          runs-on: 'ubuntu-latest'
+          base-image: "ubuntu:24.04"
+          skip-drivers: 'true'
+          backend: "moonshine"
+          dockerfile: "./backend/Dockerfile.python"
+          context: "./"
+          ubuntu-version: '2404'
         # CUDA 12 builds
         - build-type: 'cublas'
           cuda-major-version: "12"
@@ -222,6 +235,19 @@ jobs:
           dockerfile: "./backend/Dockerfile.python"
           context: "./"
           ubuntu-version: '2404'
+        - build-type: 'cublas'
+          cuda-major-version: "12"
+          cuda-minor-version: "9"
+          platforms: 'linux/amd64'
+          tag-latest: 'auto'
+          tag-suffix: '-gpu-nvidia-cuda-12-moonshine'
+          runs-on: 'ubuntu-latest'
+          base-image: "ubuntu:24.04"
+          skip-drivers: 'false'
+          backend: "moonshine"
+          dockerfile: "./backend/Dockerfile.python"
+          context: "./"
+          ubuntu-version: '2404'
         - build-type: 'cublas'
           cuda-major-version: "12"
           cuda-minor-version: "9"
@@ -444,6 +470,19 @@ jobs:
           dockerfile: "./backend/Dockerfile.python"
           context: "./"
           ubuntu-version: '2404'
+        - build-type: 'cublas'
+          cuda-major-version: "13"
+          cuda-minor-version: "0"
+          platforms: 'linux/amd64'
+          tag-latest: 'auto'
+          tag-suffix: '-gpu-nvidia-cuda-13-moonshine'
+          runs-on: 'ubuntu-latest'
+          base-image: "ubuntu:24.04"
+          skip-drivers: 'false'
+          backend: "moonshine"
+          dockerfile: "./backend/Dockerfile.python"
+          context: "./"
+          ubuntu-version: '2404'
         - build-type: 'cublas'
           cuda-major-version: "13"
           cuda-minor-version: "0"
```

AGENTS.md

Lines changed: 144 additions & 0 deletions
@@ -15,6 +15,150 @@ Let's say the user wants to build a particular backend for a given platform. For

- The user may say they want to build AMD or ROCm instead of hipblas, Intel instead of SYCL, or NVIDIA instead of l4t or cublas. Ask for confirmation if there is ambiguity.
- Sometimes the user may need extra parameters to be added to `docker build` (e.g. `--platform` for cross-platform builds or `--progress` to view the full logs), in which case you can generate the `docker build` command directly.

## Adding a New Backend

When adding a new backend to LocalAI, you need to update several files so that the backend is properly built, tested, and registered. Here's a step-by-step guide based on the pattern used for adding backends like `moonshine`:

### 1. Create Backend Directory Structure

Create the backend directory under the appropriate location:

- **Python backends**: `backend/python/<backend-name>/`
- **Go backends**: `backend/go/<backend-name>/`
- **C++ backends**: `backend/cpp/<backend-name>/`

For Python backends, you'll typically need:

- `backend.py` - Main gRPC server implementation
- `Makefile` - Build configuration
- `install.sh` - Installation script for dependencies
- `protogen.sh` - Protocol buffer generation script
- `requirements.txt` - Python dependencies
- `run.sh` - Runtime script
- `test.py` / `test.sh` - Test files
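The layout above can be scaffolded with a small shell snippet (the backend name `mybackend` is hypothetical; the file contents are up to you):

```shell
# Scaffold the Python backend layout listed above (hypothetical name "mybackend").
backend=mybackend
mkdir -p "backend/python/${backend}"
cd "backend/python/${backend}"
touch backend.py Makefile install.sh protogen.sh requirements.txt run.sh test.py test.sh
# The shell scripts are expected to be executable.
chmod +x install.sh protogen.sh run.sh test.sh
```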
### 2. Add Build Configurations to `.github/workflows/backend.yml`

Add build matrix entries for each platform/GPU type you want to support. Look at similar backends (e.g., `chatterbox`, `faster-whisper`) for reference.

**Placement in file:**

- CPU builds: add after the other CPU builds (e.g., after `cpu-chatterbox`)
- CUDA 12 builds: add after the other CUDA 12 builds (e.g., after `gpu-nvidia-cuda-12-chatterbox`)
- CUDA 13 builds: add after the other CUDA 13 builds (e.g., after `gpu-nvidia-cuda-13-chatterbox`)

**Additional build types you may need:**

- ROCm/HIP: use `build-type: 'hipblas'` with `base-image: "rocm/dev-ubuntu-24.04:6.4.4"`
- Intel/SYCL: use `build-type: 'intel'` or `build-type: 'sycl_f16'`/`'sycl_f32'` with `base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"`
- L4T (ARM): use `build-type: 'l4t'` with `platforms: 'linux/arm64'` and `runs-on: 'ubuntu-24.04-arm'`
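As a concrete reference, the CPU matrix entry added for `moonshine` in this commit looks like the following (indentation approximated to match the surrounding matrix entries):

```yaml
- build-type: ''
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-cpu-moonshine'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'true'
  backend: "moonshine"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
```

For the GPU variants, switch `build-type` to `'cublas'`, fill in the CUDA major/minor versions, adjust the `tag-suffix`, and set `skip-drivers: 'false'`.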
### 3. Add Backend Metadata to `backend/index.yaml`

**Step 3a: Add Meta Definition**

Add a YAML anchor definition in the `## metas` section (around lines 2-300). Look for similar backends, such as `diffusers` or `chatterbox`, to use as a template.

**Step 3b: Add Image Entries**

Add image entries at the end of the file, following the pattern of similar backends such as `diffusers` or `chatterbox`. Include both `latest` (production) and `master` (development) tags.
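A minimal sketch of the two kinds of entries (angle-bracket names are placeholders; the field set is based on the `moonshine` entries added in this commit):

```yaml
## metas section: anchor definition
- &<backend-name>
  description: |
    <one- or two-line description>
  urls:
    - <upstream repository URL>
  tags:
    - <tag>
  license: <license>
  name: "<backend-name>"
  alias: "<backend-name>"
  capabilities:
    default: "cpu-<backend-name>"
    nvidia: "cuda12-<backend-name>"
    nvidia-cuda-12: "cuda12-<backend-name>"
    nvidia-cuda-13: "cuda13-<backend-name>"

## end of file: one image entry per tag variant, e.g.
- !!merge <<: *<backend-name>
  name: "cpu-<backend-name>"
  uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-<backend-name>"
  mirrors:
    - localai/localai-backends:latest-cpu-<backend-name>
```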
### 4. Update the Makefile

The Makefile needs to be updated in several places to support building and testing the new backend:

**Step 4a: Add to `.NOTPARALLEL`**

Add `backends/<backend-name>` to the `.NOTPARALLEL` line (around line 2) to prevent parallel execution conflicts:

```makefile
.NOTPARALLEL: ... backends/<backend-name>
```

**Step 4b: Add to `prepare-test-extra`**

Add the backend to the `prepare-test-extra` target (around line 312) to prepare it for testing:

```makefile
prepare-test-extra: protogen-python
	...
	$(MAKE) -C backend/python/<backend-name>
```

**Step 4c: Add to `test-extra`**

Add the backend to the `test-extra` target (around line 319) to run its tests:

```makefile
test-extra: prepare-test-extra
	...
	$(MAKE) -C backend/python/<backend-name> test
```

**Step 4d: Add Backend Definition**

Add a backend definition variable in the backend definitions section (around lines 428-457). The format depends on the backend type:

**For Python backends with root context** (like `faster-whisper`, `bark`):

```makefile
BACKEND_<BACKEND_NAME> = <backend-name>|python|.|false|true
```

**For Python backends with `./backend` context** (like `chatterbox`, `moonshine`):

```makefile
BACKEND_<BACKEND_NAME> = <backend-name>|python|./backend|false|true
```

**For Go backends:**

```makefile
BACKEND_<BACKEND_NAME> = <backend-name>|golang|.|false|true
```

**Step 4e: Generate Docker Build Target**

Add an `eval` call to generate the docker-build target (around lines 480-501):

```makefile
$(eval $(call generate-docker-build-target,$(BACKEND_<BACKEND_NAME>)))
```

**Step 4f: Add to `docker-build-backends`**

Add `docker-build-<backend-name>` to the `docker-build-backends` target (around line 507):

```makefile
docker-build-backends: ... docker-build-<backend-name>
```

**Determining the Context:**

- If the backend is in `backend/python/<backend-name>/` and uses `./backend` as the context in the workflow file, use the `./backend` context
- If the backend is in `backend/python/<backend-name>/` but uses `.` as the context in the workflow file, use the `.` context
- Check similar backends to determine the correct context
### 5. Verification Checklist

After adding a new backend, verify:

- [ ] Backend directory structure is complete with all necessary files
- [ ] Build configurations added to `.github/workflows/backend.yml` for all desired platforms
- [ ] Meta definition added to `backend/index.yaml` in the `## metas` section
- [ ] Image entries added to `backend/index.yaml` for all build variants (latest + development)
- [ ] Tag suffixes match between the workflow file and `index.yaml`
- [ ] Makefile updated with all six required changes (`.NOTPARALLEL`, `prepare-test-extra`, `test-extra`, backend definition, docker-build target eval, `docker-build-backends`)
- [ ] No YAML syntax errors (check with a linter)
- [ ] No Makefile syntax errors (check with a linter)
- [ ] Follows the same pattern as similar backends (e.g., a transcription backend should follow the `faster-whisper` pattern)
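For the YAML checks, anything that parses the files will do. A minimal, self-contained sketch using PyYAML (assumed to be installed; an inline fragment stands in for the real files):

```python
import yaml

# A well-formed fragment in the style of the backend/index.yaml image entries.
good = """
- name: "cpu-example"
  uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-example"
  mirrors:
    - localai/localai-backends:latest-cpu-example
"""

# The same fragment with one key mis-indented.
bad = good.replace('  mirrors:', ' mirrors:')

def lint(text):
    """Return None if the YAML parses, otherwise the parser's error message."""
    try:
        yaml.safe_load(text)
        return None
    except yaml.YAMLError as exc:
        return str(exc)

print(lint(good))             # parses cleanly
print(lint(bad) is not None)  # the indentation error is reported
```

In the repository you would point the same `safe_load` call at `.github/workflows/backend.yml` and `backend/index.yaml` instead of the inline strings.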
### 6. Example: Adding a Python Backend

For reference, when `moonshine` was added:

- **Files created**: `backend/python/moonshine/{backend.py, Makefile, install.sh, protogen.sh, requirements.txt, run.sh, test.py, test.sh}`
- **Workflow entries**: 3 build configurations (CPU, CUDA 12, CUDA 13)
- **Index entries**: 1 meta definition + 6 image entries (cpu, cuda12, cuda13 × latest/development)
- **Makefile updates**:
  - Added to the `.NOTPARALLEL` line
  - Added to the `prepare-test-extra` and `test-extra` targets
  - Added `BACKEND_MOONSHINE = moonshine|python|./backend|false|true`
  - Added the eval for docker-build target generation
  - Added `docker-build-moonshine` to `docker-build-backends`

# Coding style

- The project has the following .editorconfig

Makefile

Lines changed: 6 additions & 2 deletions
```diff
@@ -1,5 +1,5 @@
 # Disable parallel execution for backend builds
-.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/piper backends/stablediffusion-ggml backends/whisper backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/stablediffusion-ggml-darwin backends/vllm
+.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/piper backends/stablediffusion-ggml backends/whisper backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/stablediffusion-ggml-darwin backends/vllm backends/moonshine
 
 GOCMD=go
 GOTEST=$(GOCMD) test
@@ -315,13 +315,15 @@ prepare-test-extra: protogen-python
 	$(MAKE) -C backend/python/chatterbox
 	$(MAKE) -C backend/python/vllm
 	$(MAKE) -C backend/python/vibevoice
+	$(MAKE) -C backend/python/moonshine
 
 test-extra: prepare-test-extra
 	$(MAKE) -C backend/python/transformers test
 	$(MAKE) -C backend/python/diffusers test
 	$(MAKE) -C backend/python/chatterbox test
 	$(MAKE) -C backend/python/vllm test
 	$(MAKE) -C backend/python/vibevoice test
+	$(MAKE) -C backend/python/moonshine test
 
 DOCKER_IMAGE?=local-ai
 DOCKER_AIO_IMAGE?=local-ai-aio
@@ -455,6 +457,7 @@ BACKEND_VLLM = vllm|python|./backend|false|true
 BACKEND_DIFFUSERS = diffusers|python|./backend|--progress=plain|true
 BACKEND_CHATTERBOX = chatterbox|python|./backend|false|true
 BACKEND_VIBEVOICE = vibevoice|python|./backend|--progress=plain|true
+BACKEND_MOONSHINE = moonshine|python|./backend|false|true
 
 # Helper function to build docker image for a backend
 # Usage: $(call docker-build-backend,BACKEND_NAME,DOCKERFILE_TYPE,BUILD_CONTEXT,PROGRESS_FLAG,NEEDS_BACKEND_ARG)
@@ -499,12 +502,13 @@ $(eval $(call generate-docker-build-target,$(BACKEND_VLLM)))
 $(eval $(call generate-docker-build-target,$(BACKEND_DIFFUSERS)))
 $(eval $(call generate-docker-build-target,$(BACKEND_CHATTERBOX)))
 $(eval $(call generate-docker-build-target,$(BACKEND_VIBEVOICE)))
+$(eval $(call generate-docker-build-target,$(BACKEND_MOONSHINE)))
 
 # Pattern rule for docker-save targets
 docker-save-%: backend-images
 	docker save local-ai-backend:$* -o backend-images/$*.tar
 
-docker-build-backends: docker-build-llama-cpp docker-build-rerankers docker-build-vllm docker-build-transformers docker-build-diffusers docker-build-kokoro docker-build-faster-whisper docker-build-coqui docker-build-bark docker-build-chatterbox docker-build-vibevoice docker-build-exllama2
+docker-build-backends: docker-build-llama-cpp docker-build-rerankers docker-build-vllm docker-build-transformers docker-build-diffusers docker-build-kokoro docker-build-faster-whisper docker-build-coqui docker-build-bark docker-build-chatterbox docker-build-vibevoice docker-build-exllama2 docker-build-moonshine
 
 ########################################################
 ### END Backends
```

backend/index.yaml

Lines changed: 56 additions & 0 deletions
```diff
@@ -275,6 +275,24 @@
     amd: "rocm-faster-whisper"
     nvidia-cuda-13: "cuda13-faster-whisper"
     nvidia-cuda-12: "cuda12-faster-whisper"
+- &moonshine
+  description: |
+    Moonshine is a fast, accurate, and efficient speech-to-text transcription model using ONNX Runtime.
+    It provides real-time transcription capabilities with support for multiple model sizes and GPU acceleration.
+  urls:
+    - https://github.com/moonshine-ai/moonshine
+  tags:
+    - speech-to-text
+    - transcription
+    - ONNX
+  license: MIT
+  name: "moonshine"
+  alias: "moonshine"
+  capabilities:
+    nvidia: "cuda12-moonshine"
+    default: "cpu-moonshine"
+    nvidia-cuda-13: "cuda13-moonshine"
+    nvidia-cuda-12: "cuda12-moonshine"
 - &kokoro
   icon: https://avatars.githubusercontent.com/u/166769057?v=4
   description: |
@@ -1315,6 +1333,44 @@
     uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-13-faster-whisper"
     mirrors:
       - localai/localai-backends:master-gpu-nvidia-cuda-13-faster-whisper
+## moonshine
+- !!merge <<: *moonshine
+  name: "moonshine-development"
+  capabilities:
+    nvidia: "cuda12-moonshine-development"
+    default: "cpu-moonshine-development"
+    nvidia-cuda-13: "cuda13-moonshine-development"
+    nvidia-cuda-12: "cuda12-moonshine-development"
+- !!merge <<: *moonshine
+  name: "cpu-moonshine"
+  uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-moonshine"
+  mirrors:
+    - localai/localai-backends:latest-cpu-moonshine
+- !!merge <<: *moonshine
+  name: "cpu-moonshine-development"
+  uri: "quay.io/go-skynet/local-ai-backends:master-cpu-moonshine"
+  mirrors:
+    - localai/localai-backends:master-cpu-moonshine
+- !!merge <<: *moonshine
+  name: "cuda12-moonshine"
+  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-moonshine"
+  mirrors:
+    - localai/localai-backends:latest-gpu-nvidia-cuda-12-moonshine
+- !!merge <<: *moonshine
+  name: "cuda12-moonshine-development"
+  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-moonshine"
+  mirrors:
+    - localai/localai-backends:master-gpu-nvidia-cuda-12-moonshine
+- !!merge <<: *moonshine
+  name: "cuda13-moonshine"
+  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-13-moonshine"
+  mirrors:
+    - localai/localai-backends:latest-gpu-nvidia-cuda-13-moonshine
+- !!merge <<: *moonshine
+  name: "cuda13-moonshine-development"
+  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-13-moonshine"
+  mirrors:
+    - localai/localai-backends:master-gpu-nvidia-cuda-13-moonshine
 ## coqui
 
 - !!merge <<: *coqui
```
