5970 | 5970 | - filename: m1-32b.Q4_K_M.gguf |
5971 | 5971 | sha256: 1dfa3b6822447aca590d6f2881cf277bd0fbde633a39c5a20b521f4a59145e3f |
5972 | 5972 | uri: huggingface://mradermacher/m1-32b-GGUF/m1-32b.Q4_K_M.gguf |
| 5973 | +- !!merge <<: *qwen25 |
| 5974 | + name: "qwen2.5-14b-instruct-1m" |
| 5975 | + urls: |
| 5976 | + - https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M |
| 5977 | + - https://huggingface.co/bartowski/Qwen2.5-14B-Instruct-1M-GGUF |
| 5978 | + description: | |
| 5979 | + Qwen2.5-1M is the long-context version of the Qwen2.5 series models, supporting a context length of up to 1M tokens. Compared to the Qwen2.5 128K version, Qwen2.5-1M demonstrates significantly improved performance in handling long-context tasks while maintaining its capability in short tasks. |
| 5980 | + |
| 5981 | + The model has the following features: |
| 5982 | + |
| 5983 | + Type: Causal Language Models |
| 5984 | + Training Stage: Pretraining & Post-training |
| 5985 | + Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias |
| 5986 | + Number of Parameters: 14.7B |
| 5987 | +              Number of Parameters (Non-Embedding): 13.1B |
| 5988 | + Number of Layers: 48 |
| 5989 | + Number of Attention Heads (GQA): 40 for Q and 8 for KV |
| 5990 | + Context Length: Full 1,010,000 tokens and generation 8192 tokens |
| 5991 | + We recommend deploying with our custom vLLM, which introduces sparse attention and length extrapolation methods to ensure efficiency and accuracy for long-context tasks. For specific guidance, refer to this section. |
| 5992 | + You can also use the previous framework that supports Qwen2.5 for inference, but accuracy degradation may occur for sequences exceeding 262,144 tokens. |
| 5993 | + |
| 5994 | + For more details, please refer to our blog, GitHub, Technical Report, and Documentation. |
| 5995 | + overrides: |
| 5996 | + parameters: |
| 5997 | + model: Qwen2.5-14B-Instruct-1M-Q4_K_M.gguf |
| 5998 | + files: |
| 5999 | + - filename: Qwen2.5-14B-Instruct-1M-Q4_K_M.gguf |
| 6000 | + sha256: a1a0fa3e2c3f9d63f9202af9172cffbc0b519801dff740fffd39f6a063a731ef |
| 6001 | + uri: huggingface://bartowski/Qwen2.5-14B-Instruct-1M-GGUF/Qwen2.5-14B-Instruct-1M-Q4_K_M.gguf |
5973 | 6002 | - &llama31 |
5974 | 6003 | url: "github:mudler/LocalAI/gallery/llama3.1-instruct.yaml@master" ## LLama3.1 |
5975 | 6004 | icon: https://avatars.githubusercontent.com/u/153379578 |
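Once this entry is in the gallery and the model has been installed in a running LocalAI instance, it can be queried through LocalAI's OpenAI-compatible chat endpoint using the `name` defined above. A minimal sketch, assuming LocalAI is listening on its default port 8080 and the `qwen2.5-14b-instruct-1m` model has already been pulled from the gallery:

```python
# Minimal sketch: chat completion against a local LocalAI server via its
# OpenAI-compatible API. Assumes the server is reachable at
# http://localhost:8080 and the "qwen2.5-14b-instruct-1m" gallery model
# is installed and loaded.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # LocalAI endpoint (assumed default port)
    api_key="not-needed",                 # LocalAI typically requires no API key
)

response = client.chat.completions.create(
    model="qwen2.5-14b-instruct-1m",  # matches the gallery entry's `name`
    messages=[
        {"role": "user", "content": "Summarize the key features of Qwen2.5-1M."}
    ],
)
print(response.choices[0].message.content)
```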