Commit 7a1e4f9

mudler authored and github-actions[bot] committed
chore(model gallery): 🤖 add new models via gallery agent
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 3ce9cb5 commit 7a1e4f9

File tree

1 file changed: +27 −0 lines


gallery/index.yaml

Lines changed: 27 additions & 0 deletions
@@ -23006,3 +23006,30 @@
     - filename: nvidia.Qwen3-Nemotron-32B-RLBFF.Q4_K_M.gguf
       sha256: 5dfc9f1dc21885371b12a6e0857d86d6deb62b6601b4d439e4dfe01195a462f1
       uri: huggingface://DevQuasar/nvidia.Qwen3-Nemotron-32B-RLBFF-GGUF/nvidia.Qwen3-Nemotron-32B-RLBFF.Q4_K_M.gguf
+- !!merge <<: *deepseek-r1
+  name: "r2mu-deepseek-r1-distill-llama-8b"
+  urls:
+    - https://huggingface.co/mradermacher/R2MU-DeepSeek-R1-Distill-Llama-8B-GGUF
+  description: |
+    **R2MU-DeepSeek-R1-Distill-Llama-8B** is a distilled 8-billion-parameter language model based on the DeepSeek-R1 architecture, fine-tuned and optimized by OPTML-Group. It uses distillation to retain strong performance while reducing size and inference cost, making it efficient to deploy on consumer hardware. The model is built on Meta's Llama 3.1 family and is trained on a broad range of text data to support diverse language tasks, including reasoning, coding, and general conversation.
+
+    This version is available in multiple quantized formats (e.g., Q4_K_S, Q5_K_M, Q8_0) via the GGUF format, enabling efficient inference with tools like `llama.cpp`. It is designed for local, low-resource environments without sacrificing too much accuracy.
+
+    **Key Features:**
+    - Base model: DeepSeek-R1-Distill-Llama-8B (R2MU fine-tune by OPTML-Group)
+    - Architecture: Llama-style transformer, 8B parameters
+    - Distilled from a larger model to balance speed and performance
+    - Quantized versions available for efficient local inference
+    - Trained on diverse, high-quality text data
+    - Well suited to chatbots, code generation, and reasoning
+
+    > ✅ *Note: This is a distilled and quantized release. The original model is at [OPTML-Group/R2MU-DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/OPTML-Group/R2MU-DeepSeek-R1-Distill-Llama-8B); this repository contains the GGUF-quantized variants for local inference.*
+
+    **Use Case**: Lightweight, high-performance inference on laptops, edge devices, or local servers.
+  overrides:
+    parameters:
+      model: R2MU-DeepSeek-R1-Distill-Llama-8B.Q4_K_M.gguf
+  files:
+    - filename: R2MU-DeepSeek-R1-Distill-Llama-8B.Q4_K_M.gguf
+      sha256: c5104c70e1b43aa205e46d0e88c015541239d978d5c016825aefa3dc959cf568
+      uri: huggingface://mradermacher/R2MU-DeepSeek-R1-Distill-Llama-8B-GGUF/R2MU-DeepSeek-R1-Distill-Llama-8B.Q4_K_M.gguf
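The `uri` fields in the diff use a `huggingface://{owner}/{repo}/{filename}` scheme. As a rough, hypothetical sketch (this mapping is an assumption for illustration, not LocalAI's actual resolver code), such a URI can be translated to the Hugging Face Hub's standard `resolve` download endpoint:

```python
def hf_uri_to_url(uri: str, revision: str = "main") -> str:
    """Hypothetical mapping of a gallery 'huggingface://{owner}/{repo}/{file}'
    URI to a Hugging Face Hub download URL (illustration only)."""
    prefix = "huggingface://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a huggingface:// URI: {uri}")
    # maxsplit=2 keeps any subdirectories inside the repo intact in the filename
    owner, repo, filename = uri[len(prefix):].split("/", 2)
    return f"https://huggingface.co/{owner}/{repo}/resolve/{revision}/{filename}"

url = hf_uri_to_url(
    "huggingface://mradermacher/R2MU-DeepSeek-R1-Distill-Llama-8B-GGUF/"
    "R2MU-DeepSeek-R1-Distill-Llama-8B.Q4_K_M.gguf"
)
```

The `resolve/{revision}` path is the Hub's stable raw-file endpoint, so a client built this way would not depend on the Hub's HTML pages.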
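Each file entry carries a `sha256` digest. A minimal sketch of the kind of integrity check a client could run after downloading (streaming the hash in chunks so a multi-gigabyte GGUF never has to fit in memory; the function names here are illustrative, not LocalAI's API):

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 in 1 MiB chunks and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Digest taken from the gallery entry above
EXPECTED = "c5104c70e1b43aa205e46d0e88c015541239d978d5c016825aefa3dc959cf568"

def verify_download(path: str, expected: str = EXPECTED) -> bool:
    """True only if the file on disk matches the pinned digest."""
    return sha256_of_file(path) == expected
```

Pinning the digest in the gallery entry means a corrupted or tampered download fails loudly instead of producing a model that silently misbehaves.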

0 commit comments
