UPSTREAM PR #18291: gguf-py : do not align the data start offset by loci-dev · Pull Request #664 · auroralabs-loci/llama.cpp

loci-dev · 2025-12-22T15:37:32Z

The safetensors format doesn't require alignment. Fixes: #18282 (which was a regression caused by #15667).

My assumption was wrong. I expected alignment because GGUF aligns its data offset, because the safetensors writer aligns to 8 bytes (see https://github.com/huggingface/safetensors/blob/806426784adb43631e9a1102d4621126bb589347/safetensors/src/tensor.rs#L256-L258), and because the data offset alignment was implemented the same way in #12820. But apparently some models aren't aligned.
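For readers unfamiliar with the layout, here is a minimal sketch of where the data section starts in a safetensors file: an 8-byte little-endian header length, then the JSON header, then the tensor bytes. The helper names (`read_safetensors_header`, `tensor_bytes`, `make_blob`) are hypothetical and exist only for this illustration; this is not the code in gguf-py/gguf/utility.py.

```python
import json
import struct


def read_safetensors_header(buf: bytes):
    """Return (header, data_start) for an in-memory safetensors blob."""
    # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
    (header_len,) = struct.unpack("<Q", buf[:8])
    header = json.loads(buf[8:8 + header_len])
    # The data section starts right after the header.
    # The offset is deliberately NOT rounded up to a multiple of 8.
    data_start = 8 + header_len
    return header, data_start


def tensor_bytes(buf: bytes, header: dict, data_start: int, name: str) -> bytes:
    """Slice one tensor's raw bytes; data_offsets are relative to data_start."""
    begin, end = header[name]["data_offsets"]
    return buf[data_start + begin:data_start + end]


def make_blob(header: dict, payload: bytes) -> bytes:
    """Build a tiny blob whose header is not padded to any alignment."""
    raw = json.dumps(header, separators=(",", ":")).encode("utf-8")
    return struct.pack("<Q", len(raw)) + raw + payload
```

If a reader rounded `data_start` up to the next multiple of 8 while the writer had not padded the header, every tensor slice would be shifted by the padding amount, which matches the kind of failure described in #18282.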

It seems like PyTorch and NumPy can handle unaligned tensors, but I'm not completely sure: is that only for shape transformations, or does it also cover arithmetic on unaligned tensors? Testing this would need an unaligned model which has some arithmetic in its modify_tensors transformations. Copying the tensor (e.g. with data.copy()) wouldn't necessarily always be sufficient, because that doesn't seem to align to 8 bytes when the dtype is np.uint8. I'll try to figure out how to make an aligned copy. But if it's not really necessary in practice, then this is ready.
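One common way to get an aligned copy in NumPy, sketched here under the assumption that an 8-byte-aligned start address is the goal, is to over-allocate a byte buffer and take a view that begins at the first aligned address. The function name `aligned_copy` is hypothetical and is not part of this PR.

```python
import numpy as np


def aligned_copy(arr: np.ndarray, alignment: int = 8) -> np.ndarray:
    """Copy arr into fresh memory whose start address is a multiple of alignment."""
    nbytes = arr.nbytes
    # Over-allocate so that an aligned start address always fits in the buffer.
    buf = np.empty(nbytes + alignment, dtype=np.uint8)
    # Number of bytes to skip to reach the next aligned address.
    offset = (-buf.ctypes.data) % alignment
    out = buf[offset:offset + nbytes].view(arr.dtype).reshape(arr.shape)
    out[...] = arr  # element-wise copy; the source may itself be unaligned
    return out


# Usage: build a float32 view that starts 1 byte into a buffer, then realign it.
raw = bytearray(17)
raw[1:] = bytes(range(16))
src = np.frombuffer(raw, dtype=np.float32, offset=1, count=4)
dst = aligned_copy(src)
print(src.flags.aligned, dst.ctypes.data % 8)
```

The returned array keeps the over-allocated buffer alive through its base reference, so the extra bytes cost at most `alignment` per tensor.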

EDIT: I've looked at the .data_ptr() addresses when using the safetensors library with an unaligned model, and it doesn't make an aligned copy (at least when using get_slice, as has been the case since #8482). So the new behavior is pretty much the same as with the safetensors library.

Tested on https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Thanks @fairydreaming for finding this problem! (and finding the rationale behind why unaligned safetensors exist)

loci-review · 2025-12-23T09:58:25Z

Performance Analysis Summary - PR #664

Analysis Scope: Pull request #664 modifies gguf-py/gguf/utility.py, removing 8-byte alignment logic from safetensors file parsing in Python utility code.

Performance Impact: Zero measurable impact on inference performance. Analysis confirms no changes in response time, throughput, or power consumption across all 16 analyzed binaries. The modification affects model loading utilities written in Python, not the C++ inference engine.

Code Change Nature: This is a correctness fix that removes incorrect alignment assumptions when parsing safetensors files. The change enables loading of unaligned safetensors models (e.g., DeepSeek-R1-Distill-Qwen-1.5B) that previously failed. The removed code performed unnecessary offset calculations during one-time model loading operations.

Inference Performance: No impact on tokens per second. Functions responsible for inference (llama_decode, llama_encode, llama_tokenize) show zero change in response time and throughput. Power consumption remains identical across all binaries including build.bin.libllama.so (186129 nJ), build.bin.libggml-cpu.so (119986 nJ), and build.bin.llama-run (223113 nJ).

Conclusion: The change fixes model loading compatibility without affecting runtime performance. All performance-critical inference paths remain unchanged.

gguf-py : do not align the data start offset

5f14aa8

The safetensors format doesn't require alignment.

loci-dev had a problem deploying to PROD__AL_DEMO December 22, 2025 15:37 — with GitHub Actions Error

loci-dev force-pushed the main branch 8 times, most recently from 43ae401 to 37b9287 Compare December 23, 2025 08:12

loci-dev temporarily deployed to PROD__AL_DEMO December 23, 2025 09:10 — with GitHub Actions Inactive

loci-dev force-pushed the main branch 18 times, most recently from 3f5c44f to 49ab457 Compare December 25, 2025 18:11

loci-dev force-pushed the main branch 30 times, most recently from 68d2c99 to 410c086 Compare January 1, 2026 02:55

loci-dev wants to merge 1 commit into main from upstream-PR18291-branch_ggml-org-compilade/fix-safetensors-unaligned
