tests : update for LLAMA_SET_ROWS=1 by ggerganov · Pull Request #14961 · ggml-org/llama.cpp

ggerganov · 2025-07-30T07:39:20Z

target #14960

Extract the test updates from #14959 into a separate PR, to be merged before enabling LLAMA_SET_ROWS=1 by default.

Test updates:

- test-thread-safety
  - limit the number of CPU threads per context to avoid hanging the system in some cases
  - set cparams.n_seq_max = 1
- embedding
  - default to unified KV cache if -np is not specified
- save-load-state
  - default to unified KV cache if -np is not specified
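
As an illustration only (this is not the code from the PR), the "default to unified KV cache if -np is not specified" rule could be sketched as below. The `example_params` struct, its field names, and `apply_kv_default` are hypothetical stand-ins, and the sketch assumes `n_parallel` stays at 1 when `-np` is absent, so it cannot tell an explicit `-np 1` apart from a missing flag.

```cpp
#include <cassert>

// Hypothetical stand-in for an example's parsed arguments. The field names
// are assumptions for illustration, not the actual llama.cpp definitions.
struct example_params {
    int  n_parallel = 1;     // assumed default when -np is not given
    bool kv_unified = false; // true: all sequences share one KV buffer
};

// Hypothetical helper: fall back to a unified KV cache when the user did not
// request multiple parallel sequences.
void apply_kv_default(example_params & params) {
    assert(params.n_parallel >= 1);
    if (params.n_parallel == 1) {
        params.kv_unified = true;
    }
}
```

Under these assumptions, a default-constructed `example_params` ends up with `kv_unified == true`, while one with `n_parallel = 4` is left unchanged.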

ggml-ci

ggerganov · 2025-07-30T10:59:45Z

tests/test-thread-safety.cpp

+    // each context has a single sequence
+    cparams.n_seq_max = 1;
+
+    // prevent from launching too many threads
+    cparams.n_threads = std::min<int>(std::max(2u, std::thread::hardware_concurrency()/params.n_parallel), cparams.n_threads);
+


@slaren Small change to the test to make it compatible with the split KV cache. I reduced the number of CPU threads because on the MacBook the process takes a long time (several minutes) to terminate. I think it's some resource congestion when the process starts many threads, but I'm not sure.

This is a known issue with the thread pool implementation: using more threads than are available results in the threads spending more time spinning than doing work.

I am not convinced that it is good to ignore the user's parameters to work around what is essentially a bug. Can this be solved by running the test with -t 1?

Yes, -t 1 works. I was thinking of using -t 2 so that we have context-level concurrency too. With -t 2 the test also runs cleanly on my devices.
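
For reference, the clamp from the diff above can be read as the standalone function below. This is a minimal sketch: `clamp_threads_per_context` and its parameter names are hypothetical, and `hw_threads` stands in for the value of `std::thread::hardware_concurrency()`, which may be 0 when the count is unknown. Judging by the later commit message "avoid overriding threads", the override appears to have been dropped in favor of passing a small `-t` value to the test.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper mirroring the expression in the diff:
//   min(max(2, hardware_concurrency / n_parallel), n_threads)
// Each context gets a share of the hardware threads, at least 2,
// but never more than the user requested.
int clamp_threads_per_context(unsigned hw_threads, unsigned n_parallel, int n_threads_requested) {
    assert(n_parallel > 0);
    const unsigned fair_share = std::max(2u, hw_threads / n_parallel);
    return std::min(static_cast<int>(fair_share), n_threads_requested);
}
```

With 8 hardware threads, 4 parallel contexts, and 16 requested threads, the function yields 2 threads per context. With 16 hardware threads, 2 contexts, and 4 requested threads, the user's value of 4 is kept because it is already below the share of 8.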

ggml-ci

* test-thread-safety : each context uses a single sequence

* embedding : handle --parallel argument

ggml-ci

* save-load : handle -np 1

ggml-ci

* thread-safety : avoid overriding threads, reduce test case arg

ggml-ci

ggerganov mentioned this pull request Jul 30, 2025

llama : enable LLAMA_SET_ROWS=1 by default #14959

Merged

github-actions bot added the testing and examples labels Jul 30, 2025

Base automatically changed from gg/graph-fix-stack-use-after-return to master July 30, 2025 10:52

ggerganov added 3 commits July 30, 2025 13:53

test-thread-safety : each context uses a single sequence

07d4b29

embedding : handle --parallel argument

d90b20d

ggml-ci

save-load : handle -np 1

d6233d6

ggml-ci

ggerganov force-pushed the gg/tests-update-for-set-rows branch from e1ebdea to d6233d6 Compare July 30, 2025 10:53

ggerganov commented Jul 30, 2025

thread-safety : avoid overriding threads, reduce test case arg

4e4c6a7

ggml-ci

ggerganov requested a review from slaren July 30, 2025 11:46

slaren approved these changes Jul 30, 2025

ggerganov merged commit 00131d6 into master Jul 30, 2025
54 of 55 checks passed

ggerganov deleted the gg/tests-update-for-set-rows branch July 30, 2025 12:12

ggerganov merged 4 commits into master from gg/tests-update-for-set-rows
