23006 | 23006 | - filename: nvidia.Qwen3-Nemotron-32B-RLBFF.Q4_K_M.gguf |
23007 | 23007 | sha256: 5dfc9f1dc21885371b12a6e0857d86d6deb62b6601b4d439e4dfe01195a462f1 |
23008 | 23008 | uri: huggingface://DevQuasar/nvidia.Qwen3-Nemotron-32B-RLBFF-GGUF/nvidia.Qwen3-Nemotron-32B-RLBFF.Q4_K_M.gguf |
| 23009 | +- !!merge <<: *deepseek-r1 |
| 23010 | + name: "r2mu-deepseek-r1-distill-llama-8b" |
| 23011 | + urls: |
| 23012 | + - https://huggingface.co/mradermacher/R2MU-DeepSeek-R1-Distill-Llama-8B-GGUF |
| 23013 | + description: | |
| 23014 | + The **R2MU-DeepSeek-R1-Distill-Llama-8B** is an 8-billion-parameter language model released by the OPTML-Group, built on DeepSeek-R1-Distill-Llama-8B, which distills DeepSeek-R1's reasoning capability into a Llama-3.1-8B base. Distillation retains strong performance while reducing size and inference cost, making the model efficient to deploy on consumer hardware. It supports a broad range of language tasks, including reasoning, coding, and general conversation. |
| 23015 | + |
| 23016 | + This version is available in multiple quantized formats (e.g., Q4_K_S, Q5_K_M, Q8_0) in the GGUF format, enabling efficient inference with tools such as `llama.cpp`. It is designed for local, low-resource environments with only a modest loss in accuracy. |
| 23017 | + |
| 23018 | + **Key Features:** |
| 23019 | + - Base model: DeepSeek-R1-Distill-Llama-8B; R2MU fine-tune by OPTML-Group |
| 23020 | + - Architecture: Llama-style transformer, 8B parameters |
| 23021 | + - Distilled from a larger model to balance speed and performance |
| 23022 | + - Quantized versions available for efficient local inference |
| 23023 | + - Trained on diverse, high-quality text data |
| 23024 | + - Ideal for applications like chatbots, code generation, and reasoning |
| 23025 | + |
| 23026 | + > ✅ *Note: This repository contains GGUF-quantized variants for local inference. The original (unquantized) model can be found at [OPTML-Group/R2MU-DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/OPTML-Group/R2MU-DeepSeek-R1-Distill-Llama-8B).* |
| 23027 | + |
| 23028 | + **Use Case**: Excellent for lightweight, high-performance inference on laptops, edge devices, or local servers. |
| 23029 | + overrides: |
| 23030 | + parameters: |
| 23031 | + model: R2MU-DeepSeek-R1-Distill-Llama-8B.Q4_K_M.gguf |
| 23032 | + files: |
| 23033 | + - filename: R2MU-DeepSeek-R1-Distill-Llama-8B.Q4_K_M.gguf |
| 23034 | + sha256: c5104c70e1b43aa205e46d0e88c015541239d978d5c016825aefa3dc959cf568 |
| 23035 | + uri: huggingface://mradermacher/R2MU-DeepSeek-R1-Distill-Llama-8B-GGUF/R2MU-DeepSeek-R1-Distill-Llama-8B.Q4_K_M.gguf |