
vulkan: fix non-contig rope #19299

Merged
0cc4m merged 1 commit into ggml-org:master from jeffbolznv:rope_noncontig on Feb 5, 2026

Conversation

@jeffbolznv
Collaborator

For #19296.

@jeffbolznv jeffbolznv requested a review from 0cc4m as a code owner February 3, 2026 18:23
@github-actions github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Feb 3, 2026
Collaborator

@LostRuins left a comment


Can confirm it fixes #19292 for me!

Works fine and fixes the incoherence-after-shifting regression on GLM-4-32B-0414 and GLM-4.7-Flash.

Much appreciated.

ORippler added a commit to ORippler/llama.cpp that referenced this pull request Feb 4, 2026
Seems memory layout is shared with Vulkan so we can port fix from
ggml-org#19299
@0cc4m 0cc4m merged commit c342c3b into ggml-org:master Feb 5, 2026
78 checks passed
ORippler added a commit to ORippler/llama.cpp that referenced this pull request Feb 6, 2026
ggerganov added a commit that referenced this pull request Feb 8, 2026
* Rename variables + fix rope_neox

Seems memory layout is shared with Vulkan so we can port fix from
#19299

* Fix rope_multi

* Fix rope_vision

* Fix rope_norm

* Rename ne* to ne0* for consistent variable naming

* cont : consistent stride names

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
maxious added a commit to maxious/llama.cpp that referenced this pull request Feb 8, 2026
This fix adds support for non-contiguous (strided) source tensors and
proper destination stride handling for inplace operations.

Changes:
- Add s3 (dimension 3 stride) parameter to all ROPE kernels
- Add d1, d2, d3 (destination strides) parameters to all ROPE kernels
- Properly decompose 4D coordinates (i1, i2, i3) from row_dst
- Use strided indexing (i3*s3 + i2*s2 + i1*s1 + i0) for source access
- Use destination strides (i3*d3 + i2*d2 + i1*d1 + i0) for destination access
- Fix pos array indexing to use i2 (actual dimension 2 index)

This aligns the SYCL implementation with the Vulkan fix in PR ggml-org#19299
and enables proper support for KV cache shift operations which use
non-contiguous tensor views.

All 288 ROPE tests now pass including:
- Non-contiguous (v=1) tests
- Inplace (inplace=1) tests
- Combined non-contiguous + inplace tests
- All modes: NORM (0), NEOX (2), MROPE (8), VISION (24), etc.
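The indexing scheme described in the commit message above can be sketched in plain C++. This is an illustrative model only, not the actual SYCL kernel code: the helper names `decompose_row`, `src_index`, and `dst_index` are invented for the example, while the parameter names (`s1`–`s3` for source strides, `d1`–`d3` for destination strides, `row_dst` for the flat row index) follow the commit message. The assumed layout is the usual ggml one: dimension 0 contiguous, higher dimensions addressed through per-dimension strides.

```cpp
#include <cstdint>

// Result of decomposing a flat destination row index into the three
// higher-dimension coordinates of a 4D tensor (i0 iterates within a row).
struct Coord3 {
    int64_t i1, i2, i3;
};

// Decompose row_dst into (i1, i2, i3) given the extents ne1 and ne2,
// with rows laid out i1-fastest. i2 is what the pos[] array must be
// indexed with, per the commit message.
inline Coord3 decompose_row(int64_t row_dst, int64_t ne1, int64_t ne2) {
    const int64_t i1 = row_dst % ne1;
    const int64_t i2 = (row_dst / ne1) % ne2;
    const int64_t i3 = row_dst / (ne1 * ne2);
    return {i1, i2, i3};
}

// Element offset into a possibly non-contiguous source view:
// dim 0 is contiguous, dims 1..3 use the strides s1..s3.
inline int64_t src_index(int64_t i0, Coord3 c,
                         int64_t s1, int64_t s2, int64_t s3) {
    return c.i3 * s3 + c.i2 * s2 + c.i1 * s1 + i0;
}

// Destination offset uses its own strides d1..d3, so inplace
// operations on strided views address the output correctly too.
inline int64_t dst_index(int64_t i0, Coord3 c,
                         int64_t d1, int64_t d2, int64_t d3) {
    return c.i3 * d3 + c.i2 * d2 + c.i1 * d1 + i0;
}
```

For a contiguous tensor the strides collapse to `s1 = ne0`, `s2 = ne0*ne1`, `s3 = ne0*ne1*ne2` and both index functions reduce to the flat offset; the point of the fix is that for a KV-cache-shift view they do not, which is why source and destination need independent stride sets.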
@CISC CISC mentioned this pull request Feb 8, 2026
maxious added a commit to maxious/llama.cpp that referenced this pull request Feb 14, 2026
ggerganov added a commit to ggml-org/ggml that referenced this pull request Feb 14, 2026
ggerganov added a commit to ggml-org/ggml that referenced this pull request Feb 14, 2026
ggerganov added a commit to ggml-org/whisper.cpp that referenced this pull request Feb 15, 2026
ggerganov added a commit to ggml-org/whisper.cpp that referenced this pull request Feb 15, 2026
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
maxious added a commit to maxious/llama.cpp that referenced this pull request Mar 2, 2026
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026