UPSTREAM PR #18420: model: add Qwen3-Omni Thinker support (qwen3omnimoe)#725
Mirrored from ggml-org/llama.cpp#18420
Hello @ngxson, I'm back! How does this look for the first PR? I'm open to any feedback.
Original Model: https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct
GGUFs: https://huggingface.co/TrevorJS/Qwen3-Omni-30B-A3B-GGUF
This PR implements the `thinker` model only, providing just text -> text.

`thinker` f16 on dgx-spark:

AI Disclosure
AI was used to write this code, but it was then reviewed, tested, and benchmarked by a human!