
use weights_only in conversion script to prevent model arbitrary code execution#32

Merged
ggerganov merged 1 commit into ggml-org:master from deepdiffuser:master
Mar 12, 2023
Conversation

@deepdiffuser
Contributor

This restricts malicious weights from executing arbitrary code by restricting the unpickler to loading only tensors, primitive types, and dictionaries.

See the torch.load docs:

https://pytorch.org/docs/stable/generated/torch.load.html

I tested this and it seems to work the same as before.
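For illustration only: `weights_only=True` works by running the checkpoint's pickle stream through a restricted unpickler that refuses to resolve arbitrary globals. A minimal stdlib sketch of the same idea (the whitelist below is a made-up example for this sketch, not torch's actual allow-list):

```python
import io
import pickle

class RestrictedUnpickler(pickle.Unpickler):
    # Hypothetical whitelist of (module, name) pairs that may be resolved.
    # Anything else (e.g. os.system smuggled in via __reduce__) is rejected
    # before it can be called.
    ALLOWED = {("builtins", "dict"), ("builtins", "list"),
               ("builtins", "int"), ("builtins", "str")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# A benign payload of primitives and dicts round-trips fine
# (plain containers never trigger find_class at all):
safe = pickle.dumps({"n_layers": 12, "name": "model"})
print(restricted_loads(safe))  # {'n_layers': 12, 'name': 'model'}

# A malicious payload that tries to invoke os.system is rejected at load time:
class Evil:
    def __reduce__(self):
        import os
        return (os.system, ("echo pwned",))

try:
    restricted_loads(pickle.dumps(Evil()))
except pickle.UnpicklingError:
    print("blocked")
```

The key point is that the command in the malicious payload never runs: the unpickler raises while resolving the global, before any call happens.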

@ggerganov ggerganov merged commit a931202 into ggml-org:master Mar 12, 2023
@wizzard0
Contributor

@deepdiffuser I do support this change, but now I get

TypeError: 'weights_only' is an invalid keyword argument for Unpickler()

Any ideas?

@deepdiffuser
Contributor Author

What version of PyTorch? I believe you need 1.13.1 for this arg.

@wizzard0
Contributor

Ah, I see. Conda gets you 1.12.1. Let's keep this thread for posterity.
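The version requirement from the exchange above can be checked up front instead of surfacing as a `TypeError` from the unpickler. A minimal sketch, assuming a torch-style version string such as `"1.12.1+cu113"` (the helper name is made up for this sketch):

```python
# Hypothetical helper: decide whether an installed torch is new enough
# for torch.load's weights_only kwarg (added in PyTorch 1.13).
def supports_weights_only(version: str) -> bool:
    # Keep only the numeric major.minor part; drop local tags like "+cu113".
    major, minor = (int(part) for part in version.split("+")[0].split(".")[:2])
    return (major, minor) >= (1, 13)

print(supports_weights_only("1.12.1"))        # False (the conda build above)
print(supports_weights_only("1.13.1+cu117"))  # True
```

In a real script this would gate the call: use `weights_only=True` when supported, and otherwise fail with a clear message asking the user to upgrade rather than silently falling back to unrestricted unpickling.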

flowgrad pushed a commit to flowgrad/llama.cpp that referenced this pull request Jun 27, 2023
Alcpz added a commit to Alcpz/llama.cpp that referenced this pull request Dec 1, 2025
Cherry pick of ggml-org#32

Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
SamuelOliveirads pushed a commit to SamuelOliveirads/llama.cpp that referenced this pull request Dec 29, 2025
* Zen4 flash attention: moving useful parts from the kq_fused_softmax branch

* Add flash attention with soft-cap and fix D = 256 case

* Flash attention refinements

* Update FlashAttn comment

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
TheTom added a commit to TheTom/llama-cpp-turboquant that referenced this pull request Mar 26, 2026
…n data

Part of ggml-org#32: turbo3 prefill degrades relative to q8_0 with context length.

Changes so far:
- Skip ggml_cont when tensors already contiguous (+1%, minimal)
- Generated 32x32 rotation matrices (turbo-rotation-data-32.h) for
  reduced group size approach (16x less matmul compute)
- Fixed V un-rotation to check v->type not k->type

Next: update QK_TURBO3_GROUP, Metal WHT kernel, and KV cache for d=32.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: tturney@psyguard.ai
didlawowo pushed a commit to didlawowo/llama.cpp that referenced this pull request Mar 27, 2026
…n data

Part of ggml-org#32: turbo3 prefill degrades relative to q8_0 with context length.

Changes so far:
- Skip ggml_cont when tensors already contiguous (+1%, minimal)
- Generated 32x32 rotation matrices (turbo-rotation-data-32.h) for
  reduced group size approach (16x less matmul compute)
- Fixed V un-rotation to check v->type not k->type

Next: update QK_TURBO3_GROUP, Metal WHT kernel, and KV cache for d=32.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: tturney@psyguard.ai