
models : fix graph splits #19866

Merged
ggerganov merged 1 commit into master from gg/qwem35-fix-graph-splits
Feb 24, 2026

Conversation


@ggerganov ggerganov commented Feb 24, 2026

fix #19860
fix #19864

Ensure the node order of Qwen 3.5 graphs is suitable for multi-GPU systems.

@jacekpoplawski
Contributor

With this code, the 27B model no longer crashes for me.

@github-actions github-actions bot added the model Model specific label on Feb 24, 2026
@mukhma0c

I had issues running Qwen3.5-27B split across 2 GPUs, where it would crash before generating anything. This PR fixed it.

My setup:
- Linux (Ubuntu)
- Intel i7-10700F
- NVIDIA RTX 3090
- NVIDIA RTX 3070

I was running the model with llama-server and the -ts 85,15 CLI argument to split it across both GPUs, and it was crashing before this PR. Now it runs fine, with PP over 700 t/s and TG over 20 t/s.
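For reference, an invocation along the lines described above might look like the sketch below. The model path, quantization, and port are placeholders (not from this PR); `-ts`/`--tensor-split` and `-ngl` are real llama-server flags for splitting tensors across GPUs and offloading layers.

```shell
# Hypothetical sketch: split Qwen3.5-27B across two GPUs, with roughly
# 85% of the work on GPU 0 (e.g. an RTX 3090) and 15% on GPU 1 (e.g. an
# RTX 3070). The GGUF path and port below are placeholders.
llama-server \
  -m ./models/qwen3.5-27b-q4_k_m.gguf \
  -ngl 99 \
  -ts 85,15 \
  --port 8080
```

The `-ts` ratios are relative weights, so `85,15` biases most of the model toward the larger-VRAM GPU while still using both.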

@ggerganov ggerganov merged commit 2446419 into master Feb 24, 2026
75 checks passed
@ggerganov ggerganov deleted the gg/qwem35-fix-graph-splits branch February 24, 2026 22:01
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Mar 3, 2026
aldehir pushed a commit to aldehir/llama.cpp that referenced this pull request Mar 6, 2026
Ethan-a2 pushed a commit to Ethan-a2/llama.cpp that referenced this pull request Mar 20, 2026

Labels

model Model specific

Projects

None yet

Development

Successfully merging this pull request may close these issues.

- Eval bug: qwen35 and qwen35moe graph split issues (Severe PP impact, crashes)
- Eval bug: CUDA error on Qwen3.5-27B

3 participants