Add batch offsets for mx.fast.rope#2564

Merged
awni merged 2 commits into main from batch_rope on Sep 9, 2025

@awni
Member

@awni awni commented Sep 3, 2025

This changes mx.fast.rope to accept a vector of offsets instead of just a single scalar. This is necessary for batch decoding, where the examples in the batch can be at different positions (offsets).
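
To illustrate what per-example offsets mean, here is a minimal pure-Python reference sketch. It is not the mlx kernel and does not call mx.fast.rope; the function name `rope_reference`, the nested-list layout, the default base of 10000, and the half-split pairing of elements are all assumptions made for illustration.

```python
import math


def rope_reference(x, offsets, base=10000.0):
    """Apply rotary position embeddings with one start offset per batch element.

    x:       nested lists shaped [batch][seq_len][dims], with dims even.
    offsets: one integer start position per batch element (hypothetical layout).
    Element i of each vector is paired with element i + dims // 2 (assumed pairing).
    """
    out = []
    for seq, offset in zip(x, offsets):
        rotated_seq = []
        for t, vec in enumerate(seq):
            half = len(vec) // 2
            pos = offset + t  # absolute position of this token in its own sequence
            rotated = [0.0] * len(vec)
            for i in range(half):
                theta = pos * base ** (-i / half)  # rotation angle for pair i
                c, s = math.cos(theta), math.sin(theta)
                x1, x2 = vec[i], vec[i + half]
                rotated[i] = x1 * c - x2 * s
                rotated[i + half] = x1 * s + x2 * c
            rotated_seq.append(rotated)
        out.append(rotated_seq)
    return out


# One decode step for a batch of two sequences that have already
# processed 0 and 3 tokens respectively.
step = [[[1.0, 2.0, 3.0, 4.0]], [[1.0, 2.0, 3.0, 4.0]]]
rotated = rope_reference(step, offsets=[0, 3])
```

With a single scalar offset, both rows would be rotated as if they sat at the same position; passing one offset per row lets each example use its own position in a single call.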

@awni awni force-pushed the batch_rope branch 3 times, most recently from c076794 to aa21ddf on September 8, 2025 14:15
Member

@angeloskath angeloskath left a comment

Looks great.

Nit / rant:

I guess it is inside the _impl because it is shared, whereas freqs is outside. The only reason I mention this is that it felt a bit weird to complicate the _impl when it hasn't really changed (still one offset per call). OTOH, I think carrying the other indices and all would be equally complicated.

@awni awni merged commit 17310d9 into main Sep 9, 2025
6 checks passed
@awni awni deleted the batch_rope branch September 9, 2025 00:35
@ZelinMa557

May I ask when mlx users will be able to use this feature via pip install mlx? This feature is really important for batch inference engines, and I'm looking forward to using it!

@awni
Member Author

awni commented Sep 9, 2025

We will aim to get a patch release out this week.

ekzhang added a commit to ekzhang/tiny-llm that referenced this pull request Sep 17, 2025
This test requires the latest version of mlx 0.29.1, since they just merged support for this in mlx a week ago: ml-explore/mlx#2564

I verified that the other tests still pass with the version upgrade.
skyzh pushed a commit to skyzh/tiny-llm that referenced this pull request Sep 17, 2025
This test requires the latest version of mlx 0.29.1, since they just merged support for this in mlx a week ago: ml-explore/mlx#2564

I verified that the other tests still pass with the version upgrade.
faisalmemon pushed a commit to faisalmemon/mlx that referenced this pull request Oct 30, 2025
* implement batch rope for Metal

* cuda rope (ml-explore#2576)

3 participants