UPSTREAM PR #17945: models : fix the attn_factor for mistral3 graphs (#526)
Conversation
Explore the complete analysis inside the Version Insights Performance Analysis Summary for PR #526.

Overview: This PR implements a correctness fix for the RoPE attention-factor calculation in Mistral3 models. The changes remove 12 lines from …

The summary covers: Key Findings, Code Changes Impact, Performance-Critical Functions, Inference Impact, Power Consumption, and Conclusion.
Mirrored from ggml-org/llama.cpp#17945
Continuation of #17644.
Fix the adjustment of the RoPE attention factor based on:
https://github.com/huggingface/transformers/blob/6d00f6b0a5679c36510f203e4226e36f517c3032/src/transformers/modeling_rope_utils.py#L336-L348
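For context, the linked transformers code computes the YaRN attention factor from the context-extension factor via a logarithmic magnitude correction. Below is a minimal sketch of that formula; the concrete values (`factor`, `mscale`, `mscale_all_dim`) are hypothetical illustrations, not taken from the Mistral3 configs touched by this PR.

```python
import math

def get_mscale(scale: float, mscale: float = 1.0) -> float:
    # YaRN magnitude scaling: no correction when the context is not extended
    if scale <= 1.0:
        return 1.0
    return 0.1 * mscale * math.log(scale) + 1.0

# Hypothetical example values for illustration only
factor = 8.0          # context-extension factor (extended_len / original_len)
mscale = 1.0
mscale_all_dim = 1.0

# Attention factor as a ratio of the two magnitude scales
attn_factor = get_mscale(factor, mscale) / get_mscale(factor, mscale_all_dim)
print(attn_factor)
```

With equal `mscale` values the ratio collapses to 1.0, which is why the fix matters mainly for models whose configs set these two parameters differently.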