Commit 897ad17

chore(model gallery): add qwen3-coder-30b-a3b-instruct based on model request (#8082)
* chore(model gallery): add qwen3-coder-30b-a3b-instruct based on model request

  Signed-off-by: rampa3 <[email protected]>

* added missing model config import URL

  Signed-off-by: rampa3 <[email protected]>

---------

Signed-off-by: rampa3 <[email protected]>
1 parent 16a18a2 commit 897ad17

File tree

1 file changed: +35 -0 lines changed


gallery/index.yaml

Lines changed: 35 additions & 0 deletions
@@ -3822,6 +3822,41 @@
     - filename: boomerang-qwen3-4.9B.Q4_K_M.gguf
       sha256: 11e6c068351d104dee31dd63550e5e2fc9be70467c1cfc07a6f84030cb701537
       uri: huggingface://mradermacher/boomerang-qwen3-4.9B-GGUF/boomerang-qwen3-4.9B.Q4_K_M.gguf
+- !!merge <<: *qwen3
+  name: "qwen3-coder-30b-a3b-instruct"
+  icon: https://cdn-avatars.huggingface.co/v1/production/uploads/620760a26e3b7210c2ff1943/-s1gyJfvbE1RgO5iBeNOi.png
+  url: "github:mudler/LocalAI/gallery/qwen3.yaml@master"
+  urls:
+    - https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct
+    - https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF
+  description: |
+    Qwen3-Coder is available in multiple sizes. Today, we're excited to introduce Qwen3-Coder-30B-A3B-Instruct. This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements:
+
+    - Significant performance among open models on agentic coding, agentic browser use, and other foundational coding tasks.
+    - Long-context capabilities with native support for 256K tokens, extendable up to 1M tokens using YaRN, optimized for repository-scale understanding.
+    - Agentic coding support for most platforms such as Qwen Code and CLINE, featuring a specially designed function call format.
+
+
+    Model Overview:
+    Qwen3-Coder-30B-A3B-Instruct has the following features:
+
+    - Type: Causal Language Models
+    - Training Stage: Pretraining & Post-training
+    - Number of Parameters: 30.5B in total and 3.3B activated
+    - Number of Layers: 48
+    - Number of Attention Heads (GQA): 32 for Q and 4 for KV
+    - Number of Experts: 128
+    - Number of Activated Experts: 8
+    - Context Length: 262,144 natively
+
+    NOTE: This model supports only non-thinking mode and does not generate <think></think> blocks in its output. Specifying enable_thinking=False is no longer required.
+  overrides:
+    parameters:
+      model: Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf
+  files:
+    - filename: Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf
+      sha256: fadc3e5f8d42bf7e894a785b05082e47daee4df26680389817e2093056f088ad
+      uri: huggingface://unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF/Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf
 - &gemma3
   url: "github:mudler/LocalAI/gallery/gemma.yaml@master"
   name: "gemma-3-27b-it"
