UPSTREAM PR #17279: convert : set expert gating func in base class #217

Open

DajanaV wants to merge 1 commit into main from
upstream-PR17279-branch_ggml-org-cisc/convert-common-expert-gating-func

Conversation


@DajanaV DajanaV commented Nov 14, 2025

Mirrored from ggml-org/llama.cpp#17279

Move the add_expert_gating_func call to the base class; there is no point in duplicating it.

Also fixes a conversion failure for dots1 caused by the following fixes to the model:


loci-review bot commented Nov 15, 2025

Access the complete analysis in the LOCI Dashboard

Performance Analysis Summary

Overview

Pull Request #217 implements a code refactoring to centralize expert gating function configuration in the base TextModel class, eliminating duplicate implementations across multiple model classes. The changes affect the Python model conversion script (convert_hf_to_gguf.py) rather than the core C++ inference engine.

Performance Impact Assessment

The analysis identified minimal performance variations in unrelated functions:

  • Highest Response Time Change: linenoiseSetCompletionCallback (+0.076%, +0.011 ns absolute)
  • Highest Throughput Change: make_unique<llm_graph_input_attn_no_cache> (+0.112%, +0.078 ns absolute)
  • Power Consumption: 0.0% change across all binaries

These performance changes are unrelated to the PR modifications and represent normal compilation variance rather than functional impacts.

Code Analysis

The refactoring consolidates expert gating function logic by:

  • Adding centralized parameter detection for ["score_function", "scoring_func", "score_func"]
  • Removing 25+ lines of duplicate code across 5 model classes
  • Standardizing error handling for unsupported gating functions
  • Fixing conversion failures for the dots1 model
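The consolidation described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual convert_hf_to_gguf.py code: the class name TextModel and the method add_expert_gating_func mirror names mentioned in the review, but the hparams structure, gating-function constants, and error message are assumptions.

```python
class TextModel:
    """Minimal sketch of a base model class that centralizes expert
    gating function detection (hypothetical, for illustration only)."""

    # Known config-key aliases for the expert scoring function,
    # as listed in the review above.
    _SCORE_KEYS = ("score_function", "scoring_func", "score_func")

    def __init__(self, hparams: dict):
        self.hparams = hparams
        self.gguf_params: dict = {}

    def set_gguf_parameters(self):
        # Probe all known aliases once, in the base class, instead of
        # duplicating this loop in every subclass.
        score_func = None
        for key in self._SCORE_KEYS:
            if key in self.hparams:
                score_func = self.hparams[key]
                break

        if score_func is None:
            return  # model has no expert gating config; nothing to set

        # Standardized error handling for unsupported gating functions.
        if score_func == "sigmoid":
            self.add_expert_gating_func("SIGMOID")
        elif score_func == "softmax":
            self.add_expert_gating_func("SOFTMAX")
        else:
            raise ValueError(f"Unsupported expert score function: {score_func}")

    def add_expert_gating_func(self, func: str):
        # Stand-in for the GGUF writer call in the real script.
        self.gguf_params["expert_gating_func"] = func


# Usage: a subclass no longer needs its own detection code.
model = TextModel({"scoring_func": "sigmoid"})
model.set_gguf_parameters()
print(model.gguf_params)  # {'expert_gating_func': 'SIGMOID'}
```

Because the detection and the error path live in one place, a model like dots1 that uses a different alias is handled automatically, which is how the PR fixes its conversion failure.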

Key Findings

Core Function Impact: None. The changes affect only the model conversion pipeline, not the critical inference functions (llama_decode, llama_encode, llama_tokenize) that determine tokens-per-second performance.

Performance Metrics: All detected changes fall within measurement noise (<1 ns absolute change). The functions showing variation (linenoiseSetCompletionCallback and a template instantiation) are unrelated to the code modifications.

Power Consumption: No measurable impact across any binary components, confirming the changes don't affect runtime execution efficiency.

Code Quality: The refactoring improves maintainability by eliminating code duplication and providing consistent parameter handling across model classes.

Critical Issues: None identified. The implementation maintains backward compatibility while fixing conversion issues for specific model types.

The changes represent a positive code quality improvement with no meaningful performance impact on the inference engine.

@DajanaV DajanaV force-pushed the main branch 25 times, most recently from f333350 to 9c4623f Compare November 18, 2025 09:10
@loci-dev loci-dev force-pushed the main branch 2 times, most recently from 64f477c to 7c4fc52 Compare November 20, 2025 11:08
@loci-dev loci-dev force-pushed the main branch 30 times, most recently from 409b78f to b789b13 Compare November 27, 2025 00:34

2 participants