UPSTREAM PR #18105: model: fix LFM2 missing tensors#594

Open
loci-dev wants to merge 1 commit into main from upstream-PR18105-branch_ngxson-xsn/lfm2_missing_tensor
Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#18105

Overlooked a change from ggml-org/llama.cpp#18051

cc @tdakhran for visibility: LFM2 reports a missing tensor without this fix

@loci-review

loci-review bot commented Dec 16, 2025

Explore the complete analysis inside the Version Insights

Performance Analysis Summary: PR #594

Pull Request: LFM2 Missing Tensors Fix
Change Scope: Single file modification (src/llama-model.cpp), 2 lines changed
Nature: Tensor naming correction for LFM2 model architecture


Analysis Overview

This PR addresses a missing tensor issue in the LFM2 model by correcting the tensor name used during model loading. The change modifies line 6239 in llama_model::load_tensors(), replacing LLM_TENSOR_OUTPUT_NORM with LLM_TENSOR_OUTPUT_NORM_LFM2 for the output normalization layer.

Performance Impact

Power Consumption Analysis:
All 16 analyzed binaries show zero or negligible power consumption change:

  • libllama.so: -0.0001% (-0.26 nJ)
  • llama-tts: +0.0004% (+0.95 nJ)
  • llama-cvector-generator: +0.0002% (+0.62 nJ)
  • All other binaries: 0.0% change

Function-Level Metrics:
No measurable performance deltas detected across any functions. The summary report returned no data for Response Time or Throughput Time changes, indicating the versions are functionally identical from a performance perspective.

Inference Impact:
No impact on tokens per second. The change affects only model loading logic for LFM2 architecture, specifically the tensor name lookup during weight initialization. Functions responsible for inference (llama_decode, llama_encode, llama_tokenize) show no performance variation.

Technical Assessment

The modification is a correctness fix rather than a performance optimization. It ensures the LFM2 model loader references the architecture-specific output normalization tensor name, preventing "missing tensor" errors during model initialization. This change has no runtime performance implications as it only affects the one-time model loading phase, not the inference execution path.

@loci-dev loci-dev force-pushed the main branch 27 times, most recently from 2e88b20 to e02e9be on December 19, 2025 at 08:12
@loci-dev loci-dev force-pushed the main branch 30 times, most recently from 15838f1 to 006b713 on December 24, 2025 at 23:08