Version Insights Performance Analysis Summary: PR #588

Analysis Scope: Model conversion embedding verification script update

This PR modifies a bash script used for embedding verification in the model conversion workflow. The change corrects file path resolution to use the converted model name instead of the original model name. No compiled binaries were modified. All performance metrics (response time, throughput, power consumption) remain unchanged at 0% across all 16 analyzed binaries, including libllama.so, libggml-cpu.so, and llama-run. The script modification adds two variable assignments for path resolution with negligible execution overhead (sub-microsecond). No functions in performance-critical areas (llama_decode, llama_encode, ggml_compute_forward, ggml_mul_mat) were affected. Tokens per second for inference workloads remains unaffected.
Mirrored from ggml-org/llama.cpp#18079
This commit updates the embedding model verification script to use the CONVERTED_EMBEDDING_MODEL environment variable instead of using the EMBEDDING_MODEL_PATH (the original embedding model path) as the basis for the converted model file name.
The motivation for this is that, currently, if the converted embedding model file name differs from the original embedding model directory/name, the verification script will look for the wrong .bin files that were generated when running the models.
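The fix described above can be sketched as follows. This is a minimal, hypothetical illustration, not the actual script from the PR: only the CONVERTED_EMBEDDING_MODEL and EMBEDDING_MODEL_PATH environment variables are taken from the PR description; the variable names, default paths, .gguf extension handling, and .bin naming scheme are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical defaults for illustration only.
EMBEDDING_MODEL_PATH="${EMBEDDING_MODEL_PATH:-models/bge-small}"
CONVERTED_EMBEDDING_MODEL="${CONVERTED_EMBEDDING_MODEL:-}"

if [ -n "$CONVERTED_EMBEDDING_MODEL" ]; then
    # New behavior: derive the base name from the converted model file itself,
    # so renamed converted files still resolve to the right .bin logits files.
    model_name=$(basename "$CONVERTED_EMBEDDING_MODEL")
    model_name="${model_name%.gguf}"
else
    # Old behavior: fall back to the original model path, which breaks when
    # the converted file name differs from the original directory/name.
    model_name=$(basename "$EMBEDDING_MODEL_PATH")
fi

# Assumed file-naming convention for the generated logits files.
echo "Looking for logits file: data/llamacpp-${model_name}-embeddings.bin"
```

With CONVERTED_EMBEDDING_MODEL unset, the script falls back to the original path, reproducing the old (broken) lookup; setting it to, say, `models/my-converted.gguf` makes the script look for `data/llamacpp-my-converted-embeddings.bin` instead.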