UPSTREAM PR #18291: gguf-py : do not align the data start offset #664

Open
loci-dev wants to merge 1 commit into main from
upstream-PR18291-branch_ggml-org-compilade/fix-safetensors-unaligned
Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#18291

The safetensors format doesn't require alignment. Fixes: #18282 (which was a regression caused by #15667).

I had assumed alignment was guaranteed, since GGUF does align its data offset, the safetensors writer aligns to 8 bytes (see https://github.com/huggingface/safetensors/blob/806426784adb43631e9a1102d4621126bb589347/safetensors/src/tensor.rs#L256-L258), and the data offset alignment was implemented in the same way in #12820. But apparently some models aren't aligned.

It seems like PyTorch and NumPy can handle unaligned tensors, but I'm not completely sure: is it only for shape transformations, or do they also support arithmetic on unaligned tensors? (Testing this would need an unaligned model whose modify_tensors transformations involve some arithmetic.) Copying the tensor (e.g. with data.copy()) wouldn't necessarily be sufficient, because that doesn't seem to align to 8 bytes when the dtype is np.uint8. I'll try to figure out how to make an aligned copy. But if it's not really necessary in practice, then this is ready.
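To illustrate the point above, here is a minimal sketch (not part of this PR; `is_aligned` and `aligned_copy` are hypothetical helper names) of checking a NumPy buffer's address and making a copy that is guaranteed to start on an 8-byte boundary, by over-allocating a uint8 buffer and slicing into it:

```python
import numpy as np

def is_aligned(a: np.ndarray, alignment: int = 8) -> bool:
    # __array_interface__["data"][0] is the address of the data buffer.
    return a.__array_interface__["data"][0] % alignment == 0

def aligned_copy(a: np.ndarray, alignment: int = 8) -> np.ndarray:
    # a.copy() only guarantees alignment to the dtype's itemsize, so a
    # uint8 copy can land on any address; over-allocate and slice instead.
    buf = np.empty(a.nbytes + alignment, dtype=np.uint8)
    start = -buf.__array_interface__["data"][0] % alignment
    out = buf[start:start + a.nbytes].view(a.dtype).reshape(a.shape)
    out[...] = a
    return out
```

The over-allocate-and-slice trick works because a fresh uint8 buffer can begin anywhere; adding up to `alignment - 1` bytes of slack always leaves room to shift the view onto an aligned address.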

EDIT: I've looked at the .data_ptr() addresses when using the safetensors library with an unaligned model, and it doesn't make an aligned copy (at least when using get_slice, as has been done since #8482). So the new behavior is pretty much the same as with the safetensors library.

Tested on https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Thanks @fairydreaming for finding this problem! (and finding the rationale behind why unaligned safetensors exist)


The safetensors format doesn't require alignment.

loci-review bot commented Dec 23, 2025

Explore the complete analysis inside the Version Insights

Performance Analysis Summary - PR #664

Analysis Scope: Pull request #664 modifies gguf-py/gguf/utility.py, removing 8-byte alignment logic from safetensors file parsing in Python utility code.

Performance Impact: Zero measurable impact on inference performance. Analysis confirms no changes in response time, throughput, or power consumption across all 16 analyzed binaries. The modification affects model loading utilities written in Python, not the C++ inference engine.

Code Change Nature: This is a correctness fix that removes incorrect alignment assumptions when parsing safetensors files. The change enables loading of unaligned safetensors models (e.g., DeepSeek-R1-Distill-Qwen-1.5B) that previously failed. The removed code performed unnecessary offset calculations during one-time model loading operations.
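For context, the safetensors layout itself is why removing the rounding is correct: the file starts with an 8-byte little-endian header size, followed by that many bytes of JSON, and the tensor data begins immediately after the JSON with no padding. A minimal parsing sketch (an illustration of the format, not the actual gguf-py code; `parse_safetensors_header` is a hypothetical name):

```python
import json
import struct

def parse_safetensors_header(path: str):
    """Read a safetensors header: an 8-byte little-endian size prefix,
    then that many bytes of JSON. Per-tensor data_offsets in the JSON
    are relative to the byte right after the header."""
    with open(path, "rb") as f:
        (header_size,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_size))
    # The data section starts here; do NOT round this up to a
    # multiple of 8 -- the format does not require alignment.
    data_start = 8 + header_size
    return header, data_start
```

A reader that rounds `data_start` up to the next multiple of 8 would misread every tensor in a file whose header length happens not to be a multiple of 8, which matches the failure this PR fixes.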

Inference Performance: No impact on tokens per second. Functions responsible for inference (llama_decode, llama_encode, llama_tokenize) show zero change in response time and throughput. Power consumption remains identical across all binaries including build.bin.libllama.so (186129 nJ), build.bin.libggml-cpu.so (119986 nJ), and build.bin.llama-run (223113 nJ).

Conclusion: The change fixes model loading compatibility without affecting runtime performance. All performance-critical inference paths remain unchanged.

@loci-dev loci-dev force-pushed the main branch 18 times, most recently from 3f5c44f to 49ab457 Compare December 25, 2025 18:11
@loci-dev loci-dev force-pushed the main branch 30 times, most recently from 68d2c99 to 410c086 Compare January 1, 2026 02:55