
UPSTREAM PR #17898: model : Qwen3-Next-80B-A3B has 48 layers#506

Open
loci-dev wants to merge 2 commits into main from upstream-PR17898-branch_EZForever-model-qwen3next-layers

Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#17898

Qwen3-Next-80B-A3B has 48 layers instead of 80, as pointed out by the model README and a comment in the original PR.

This change should be purely cosmetic; it fixes the "?B" model names shown by llama-bench, etc.

@loci-review

loci-review bot commented Dec 10, 2025

Explore the complete analysis inside the Version Insights

Pull Request #506 Technical Review

PR Summary

Title: UPSTREAM PR #17898: model : Qwen3-Next-80B-A3B has 48 layers
Changes: Corrects layer count for Qwen3-Next-80B-A3B model from 80 to 48 layers and adds missing type name string mapping.

Code Changes Analysis

Modified File: src/llama-model.cpp

Change 1 - llm_type_name() function (line 123):

  • Added case statement: case LLM_TYPE_80B_A3B: return "80B.A3B";
  • Purpose: Provides string representation for the 80B_A3B model type enum
  • Impact: Enables proper model name display in llama-bench and other tools

Change 2 - llama_model::load_hparams() function (line 2261):

  • Modified layer count check: `case 80:` changed to `case 48:`
  • Purpose: Corrects model architecture detection for Qwen3-Next-80B-A3B
  • Impact: Ensures correct model type assignment during model loading based on actual layer count

Performance Impact Assessment

Function: llm_type_name()

  • Base response time: 57 ns
  • Current response time: 62 ns
  • Absolute change: +5 ns
  • Analysis: The addition of one case statement in the switch block adds minimal overhead. The 5 ns increase represents a single additional comparison in the switch dispatch logic.

Function: llama_model::load_hparams()

  • This function is part of model loading, not inference path
  • Changes affect model initialization only, executed once per model load
  • No impact on per-token inference performance

Inference Performance:
No functions in the inference path (llama_decode, llama_encode, llama_tokenize) were modified. The changes are isolated to model metadata handling and initialization logic. Tokens per second remains unaffected.

Power Consumption:

  • Binary: build.bin.libllama.so
  • Change: +0.018% (+36 nJ)
  • Analysis: Negligible increase consistent with one additional switch case

The changes are cosmetic corrections to model metadata with no measurable impact on inference performance or throughput.

