llama : disable Direct IO by default #19109

Merged
ggerganov merged 2 commits into master from gg/llama-dio-off on Jan 28, 2026
ggerganov (Member) commented Jan 26, 2026

ref #19035 (comment)
cont #18012

  • Set llama_model_params::use_direct_io to false by default
  • Set common_params::use_direct_io to false by default

CISC (Member) commented Jan 26, 2026

Perhaps it makes sense to just disable mmap when --direct-io is used (and available)? mmap is on by default, so it is a bit silly to have to also pass --no-mmap.

ggerganov marked this pull request as ready for review January 27, 2026 16:28
ggerganov requested a review from CISC as a code owner January 27, 2026 16:28
ggerganov (Member, Author) commented

@CISC I've updated the logic as recommended.

The CUDA/Vulkan CI runs are a bit faster now:

[Image: Screenshot 2026-01-27 at 6 27 13 PM]

And on master:

[Image: Screenshot 2026-01-27 at 6 27 58 PM]

ggerganov merged commit c5c64f7 into master Jan 28, 2026
78 checks passed
ggerganov deleted the gg/llama-dio-off branch January 28, 2026 07:11
shaofeiqi pushed a commit to qualcomm/llama.cpp that referenced this pull request Feb 6, 2026
* llama : disable Direct IO by default

* cont : override mmap if supported
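
The two commit messages above can be read as a small decision rule. The sketch below illustrates one possible reading and is not the code merged in this PR. Only the field names use_mmap and use_direct_io mirror parameters named in the conversation; load_opts, default_opts, load_mode, resolve_load_mode, and the "buffered" fallback are hypothetical names introduced for illustration. It assumes that Direct IO, when requested and supported by the platform, takes precedence over mmap.

```cpp
// Hypothetical stand-in for the two loader flags discussed in this PR.
// This is NOT the real llama_model_params struct.
struct load_opts {
    bool use_mmap;
    bool use_direct_io;
};

// Assumed defaults after this PR: mmap on, Direct IO off.
constexpr load_opts default_opts() {
    return load_opts{ /*use_mmap=*/true, /*use_direct_io=*/false };
}

// Hypothetical names for the resulting file-loading strategy.
enum class load_mode { mmap, direct_io, buffered };

// Assumed rule: Direct IO wins over mmap only when it is both requested
// and supported, so the user does not also have to pass --no-mmap.
// If Direct IO is unavailable, fall back to whatever use_mmap says.
constexpr load_mode resolve_load_mode(load_opts opts, bool direct_io_supported) {
    return (opts.use_direct_io && direct_io_supported) ? load_mode::direct_io
         : opts.use_mmap                               ? load_mode::mmap
                                                       : load_mode::buffered;
}

// Compile-time checks of the cases raised in the conversation.
static_assert(resolve_load_mode(default_opts(), true) == load_mode::mmap,
              "defaults keep using mmap");
static_assert(resolve_load_mode(load_opts{true, true}, true) == load_mode::direct_io,
              "Direct IO overrides mmap when supported");
static_assert(resolve_load_mode(load_opts{true, true}, false) == load_mode::mmap,
              "unsupported Direct IO leaves mmap in effect");
```

Under these assumptions, a user who passes only --direct-io gets Direct IO where the platform supports it, and the mmap default applies everywhere else.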