UPSTREAM PR #18876: mtmd : fix ASR for LFM2.5-Audio-1.5B #939
Explore the complete analysis inside the Version Insights

Performance Review Report

Summary
This update introduces negligible performance impact, with a single commit fixing ASR (Automatic Speech Recognition) for the LFM2.5-Audio-1.5B model. The changes affect only non-critical logging and metadata utility functions in the mtmd (multimodal) library, with improvements under 50ns.

Performance Impact
Two functions in
Both functions operate outside inference hot paths: one handles diagnostic logging for InternVL vision models, the other converts GGUF metadata to strings during model loading.

Power Consumption
The mtmd library shows a 0.233% increase in power consumption (178,378 → 178,793 nJ), a difference of 415 nJ. This negligible change aligns with the addition of 37 new files for ASR functionality. All other binaries show zero power consumption change.

Code Changes
The commit adds ASR support for the LFM2.5-Audio-1.5B audio model with 37 new files and 2 modifications. The performance improvements in the two analyzed functions result from compiler-level optimizations rather than intentional code changes, likely due to build configuration updates accompanying the new ASR feature.
d36a5a9 to 49ab1eb
@loci-dev What is the biggest difference in response time?

Based on the performance review report, the biggest difference in response time is 46ns for the function clip_log_internal_v, which improved from 269ns to 223ns. This represents the largest absolute change among the functions analyzed in this update. The only other function with a response time change is gguf_data_to_str, which improved by 38ns (from 1802ns to 1764ns). Both of these are very small improvements in absolute terms (we're talking about nanoseconds), and they occurred in non-critical utility functions rather than in the main inference hot paths.
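The figures quoted in this thread can be cross-checked with a few lines of arithmetic. The sketch below only recomputes the deltas from the numbers stated in the report; the variable and function names are illustrative and not part of any tool mentioned here.

```python
# Response times in nanoseconds (before, after), as quoted in the report.
timings_ns = {
    "clip_log_internal_v": (269, 223),
    "gguf_data_to_str": (1802, 1764),
}


def improvement_ns(before: int, after: int) -> int:
    """Absolute improvement: a positive value means the function got faster."""
    return before - after


deltas = {name: improvement_ns(b, a) for name, (b, a) in timings_ns.items()}
largest = max(deltas, key=deltas.get)

# Energy figures for the mtmd library in nanojoules, as quoted in the report.
energy_before_nj, energy_after_nj = 178_378, 178_793
energy_delta_nj = energy_after_nj - energy_before_nj
energy_delta_pct = 100 * energy_delta_nj / energy_before_nj

print(deltas)                      # {'clip_log_internal_v': 46, 'gguf_data_to_str': 38}
print(largest)                     # clip_log_internal_v
print(energy_delta_nj)             # 415
print(round(energy_delta_pct, 3))  # 0.233
```

The recomputed values agree with the report: 46ns and 38ns improvements, and a 415 nJ (0.233%) increase for the mtmd library.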
d664a5a to 48924ee
6d56868 to a50395f
Mirrored from ggml-org/llama.cpp#18876
The callback was renaming the input tensor, which led to an error.
The first commit causing the issue is ggml-org/llama.cpp#17914.
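The description above does not show the affected code, so the following is only a toy model of the failure pattern it names, not llama.cpp or ggml code. It assumes, hypothetically, that the input tensor is later looked up by its name; the `Tensor` class, the `naming_callback` function, and the names `inp_raw`, `stage_0`, and `stage_1` are all invented for illustration.

```python
class Tensor:
    """Toy stand-in for a graph tensor: only a mutable name."""

    def __init__(self, name: str):
        self.name = name


def find_by_name(tensors, name):
    """Return the first tensor carrying `name`, or None if there is none."""
    return next((t for t in tensors if t.name == name), None)


def naming_callback(tensor: Tensor, label: str) -> None:
    """Hypothetical debug callback: labels a tensor by overwriting its name."""
    tensor.name = label


def build_graph(rename_input: bool):
    inp = Tensor("inp_raw")  # hypothetical input tensor name
    if rename_input:
        # Failure pattern: the callback is applied to the input itself,
        # so the name used to locate the input later is lost.
        naming_callback(inp, "stage_0")
    hidden = Tensor("node")
    naming_callback(hidden, "stage_1")  # labelling an intermediate is harmless
    return [inp, hidden]


broken = build_graph(rename_input=True)
fixed = build_graph(rename_input=False)

print(find_by_name(broken, "inp_raw"))            # None: the input can no longer be found
print(find_by_name(fixed, "inp_raw") is not None)  # True
```

In this toy model, leaving the input tensor's name untouched and labelling only intermediate results keeps the name-based lookup working. Whether the upstream fix takes exactly this form is not stated in the description.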