Add batch offsets for mx.fast.rope by awni · Pull Request #2564 · ml-explore/mlx

awni · 2025-09-03T02:41:22Z

This changes mx.fast.rope to accept a vector of offsets instead of just a single scalar. This is necessary for batch decoding, where the examples in the batch can be at different positions (offsets).
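To illustrate what per-example offsets mean, here is a minimal reference sketch in plain Python. It is not the Metal or CUDA kernel from this PR: the helper names `rope_rotate` and `rope_batched` are hypothetical, and the interleaved ("traditional") pairing and base of 10000 are assumptions chosen for the illustration. It only shows the intended semantics, namely that example `b` at local step `t` is rotated to position `offsets[b] + t`.

```python
import math


def rope_rotate(x, position, base=10000.0):
    """Rotate one even-length feature vector to the given absolute position.

    Assumes the interleaved pairing (x[2i], x[2i+1]); hypothetical helper.
    """
    dims = len(x)
    out = [0.0] * dims
    for i in range(dims // 2):
        # Each pair rotates at its own frequency base^(-2i/dims).
        theta = position * base ** (-2.0 * i / dims)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out


def rope_batched(batch, offsets, base=10000.0):
    """Apply RoPE to batch[b][t] using a scalar offset or one offset per example."""
    if isinstance(offsets, int):
        # Old behaviour: every example in the batch shares one offset.
        offsets = [offsets] * len(batch)
    if len(offsets) != len(batch):
        raise ValueError("need exactly one offset per batch example")
    return [
        [rope_rotate(vec, off + t, base) for t, vec in enumerate(seq)]
        for seq, off in zip(batch, offsets)
    ]


# Two sequences each decoding one new token, but at different positions.
batch = [[[1.0, 0.0, 1.0, 0.0]], [[1.0, 0.0, 1.0, 0.0]]]
rotated = rope_batched(batch, offsets=[3, 7])
```

With a single scalar offset, the two examples above would have to be rotated in separate calls; with a vector of offsets, one call covers both. In MLX itself, the analogous call would presumably pass an array as the `offset` argument of `mx.fast.rope`; check the documentation of your installed version for the exact accepted shapes.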

angeloskath

Looks great.

Nit / rant:

I guess it is inside the _impl because it is shared compared to freqs which is outside. The only reason I am mentioning this is because it felt a bit weird to complicate the _impl although it hasn't really changed (still one offset per call). Otoh I think carrying the other indices and all would be equally complicated.

ZelinMa557 · 2025-09-09T03:50:23Z

May I ask when mlx users will be able to use this feature via pip install mlx? This feature is really important for batch inference engines. Looking forward to using it!

awni · 2025-09-09T21:59:21Z

We will aim to get a patch release out this week.

This test requires the latest version of mlx, 0.29.1, since support for this was merged into mlx just a week ago (ml-explore/mlx#2564). I verified that the other tests still pass with the version upgrade.

* implement batch rope for Metal
* cuda rope (ml-explore#2576)

awni force-pushed the batch_rope branch 3 times, most recently from c076794 to aa21ddf · September 8, 2025 14:15

implement batch rope for Metal

d92d0fd

awni force-pushed the batch_rope branch from aa21ddf to d92d0fd · September 8, 2025 15:06

cuda rope (#2576)

496890d

awni force-pushed the batch_rope branch from f8a6084 to 496890d · September 8, 2025 16:38

awni requested review from angeloskath and zcbenz September 8, 2025 16:38

angeloskath approved these changes Sep 8, 2025

awni merged commit 17310d9 into main Sep 9, 2025
6 checks passed

awni deleted the batch_rope branch September 9, 2025 00:35

ekzhang mentioned this pull request Sep 17, 2025

Day 6, task 1 tests - RoPE with multiple offsets skyzh/tiny-llm#68

Merged

faisalmemon pushed a commit to faisalmemon/mlx that referenced this pull request Oct 30, 2025

Add batch offsets for mx.fast.rope (ml-explore#2564)

0f5b9a8

* implement batch rope for Metal
* cuda rope (ml-explore#2576)

BrewTestBot mentioned this pull request Nov 20, 2025

mlx 0.30.0 Homebrew/homebrew-core#255173

Merged

1 task

TianyiZhao1437 mentioned this pull request Jan 8, 2026

perf(mlx): rope supports batch offset GradientHQ/parallax#379

Merged

11 tasks

awni merged 2 commits into main from batch_rope
