Commit e3717e5

chore(model gallery): add qwen2.5-14b-instruct-1m (#5201)
Signed-off-by: Ettore Di Giacinto <[email protected]>
1 parent c8f6858 commit e3717e5

1 file changed, 29 additions and 0 deletions
gallery/index.yaml

Lines changed: 29 additions & 0 deletions
@@ -5970,6 +5970,35 @@
     - filename: m1-32b.Q4_K_M.gguf
       sha256: 1dfa3b6822447aca590d6f2881cf277bd0fbde633a39c5a20b521f4a59145e3f
       uri: huggingface://mradermacher/m1-32b-GGUF/m1-32b.Q4_K_M.gguf
+- !!merge <<: *qwen25
+  name: "qwen2.5-14b-instruct-1m"
+  urls:
+    - https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M
+    - https://huggingface.co/bartowski/Qwen2.5-14B-Instruct-1M-GGUF
+  description: |
+    Qwen2.5-1M is the long-context version of the Qwen2.5 series models, supporting a context length of up to 1M tokens. Compared to the Qwen2.5 128K version, Qwen2.5-1M demonstrates significantly improved performance in handling long-context tasks while maintaining its capability in short tasks.
+
+    The model has the following features:
+
+    Type: Causal Language Models
+    Training Stage: Pretraining & Post-training
+    Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
+    Number of Parameters: 14.7B
+    Number of Paramaters (Non-Embedding): 13.1B
+    Number of Layers: 48
+    Number of Attention Heads (GQA): 40 for Q and 8 for KV
+    Context Length: Full 1,010,000 tokens and generation 8192 tokens
+    We recommend deploying with our custom vLLM, which introduces sparse attention and length extrapolation methods to ensure efficiency and accuracy for long-context tasks. For specific guidance, refer to this section.
+    You can also use the previous framework that supports Qwen2.5 for inference, but accuracy degradation may occur for sequences exceeding 262,144 tokens.
+
+    For more details, please refer to our blog, GitHub, Technical Report, and Documentation.
+  overrides:
+    parameters:
+      model: Qwen2.5-14B-Instruct-1M-Q4_K_M.gguf
+  files:
+    - filename: Qwen2.5-14B-Instruct-1M-Q4_K_M.gguf
+      sha256: a1a0fa3e2c3f9d63f9202af9172cffbc0b519801dff740fffd39f6a063a731ef
+      uri: huggingface://bartowski/Qwen2.5-14B-Instruct-1M-GGUF/Qwen2.5-14B-Instruct-1M-Q4_K_M.gguf
 - &llama31
   url: "github:mudler/LocalAI/gallery/llama3.1-instruct.yaml@master" ## LLama3.1
   icon: https://avatars.githubusercontent.com/u/153379578
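
The new entry pins the GGUF file to a `sha256` digest. As a minimal sketch of how a reviewer might check a manually downloaded copy against that digest, the Python below hashes a local file in chunks and compares the result. The helper names `sha256_of_file` and `matches_gallery_checksum` are hypothetical and are not part of LocalAI; only the digest string is taken from the diff.

```python
import hashlib

# Digest copied from the `sha256` field of the gallery entry in this commit.
EXPECTED_SHA256 = "a1a0fa3e2c3f9d63f9202af9172cffbc0b519801dff740fffd39f6a063a731ef"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in fixed-size chunks so that a multi-gigabyte
    GGUF file does not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_gallery_checksum(path, expected=EXPECTED_SHA256):
    """Hypothetical helper: True if the local file matches the pinned digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_gallery_checksum("Qwen2.5-14B-Instruct-1M-Q4_K_M.gguf")` on a local download would return `True` only if the file is byte-identical to the one the entry was written against.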

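
The description quotes 48 layers, 8 KV heads (GQA), and a full context of 1,010,000 tokens. A rough back-of-the-envelope sketch of what that implies for key/value cache size follows. Two inputs are assumptions that do not appear in the entry: a per-head dimension of 128 and 16-bit cache values. Actual memory use depends on the runtime and its cache settings, so treat the result as an order-of-magnitude estimate only.

```python
# Figures quoted in the gallery entry's description.
NUM_LAYERS = 48
NUM_KV_HEADS = 8
MAX_CONTEXT_TOKENS = 1_010_000

# Assumptions, NOT stated in the entry.
HEAD_DIM = 128        # assumed per-head dimension
BYTES_PER_VALUE = 2   # assumed 16-bit cache entries


def kv_cache_bytes_per_token(layers=NUM_LAYERS, kv_heads=NUM_KV_HEADS,
                             head_dim=HEAD_DIM, bytes_per_value=BYTES_PER_VALUE):
    """Bytes of KV cache needed for one token.

    Each layer stores one key vector and one value vector per KV head,
    hence the leading factor of 2.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_bytes(tokens, **kwargs):
    """Bytes of KV cache needed for `tokens` tokens of context."""
    return tokens * kv_cache_bytes_per_token(**kwargs)


if __name__ == "__main__":
    total = kv_cache_bytes(MAX_CONTEXT_TOKENS)
    print(f"{kv_cache_bytes_per_token()} bytes per token")
    print(f"{total / 2**30:.1f} GiB at {MAX_CONTEXT_TOKENS:,} tokens")
```

Under these assumptions the cache alone is far larger than the quantized weights at full context, which is consistent with the description's advice to use a deployment with sparse attention for very long inputs.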