Add ragged paged attention #8659

vanbasten23 · 2025-01-31T06:03:49Z

Test plan:

LIBTPU_INIT_ARGS=--xla_tpu_scoped_vmem_limit_kib=65536  python /workspaces/persist/pytorch/xla/test/test_ragged_paged_attention_kernel.py 2>&1 | tee out.txt

cc: @miladm

test/test_ragged_paged_attention_kernel.py

torch_xla/experimental/pallas_kernels/ragged_paged_attention_kernel.py

bythew3i · 2025-01-31T22:09:33Z

Test plan:

LIBTPU_INIT_ARGS=--xla_tpu_scoped_vmem_limit_kib=65536  python /workspaces/persist/pytorch/xla/test/test_ragged_paged_attention_kernel.py 2>&1 | tee out.txt

How is 65536 calculated?

vanbasten23 · 2025-02-01T00:11:12Z

Test plan:

LIBTPU_INIT_ARGS=--xla_tpu_scoped_vmem_limit_kib=65536  python /workspaces/persist/pytorch/xla/test/test_ragged_paged_attention_kernel.py 2>&1 | tee out.txt

How is 65536 calculated?

I found the value in a ticket where someone else uses it. As far as I remember, the number is the VMEM limit on a TPU generation.
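For reference, here is a minimal sketch of the unit arithmetic behind the flag and of setting it from Python rather than the shell. The flag name comes from the test plan above. The helper `scoped_vmem_flag` is hypothetical, and whether 64 MiB is the right limit for any particular TPU generation is not confirmed in this thread.

```python
import os


def scoped_vmem_flag(limit_kib: int) -> str:
    """Format the libtpu flag from the test plan for a limit given in KiB.

    Hypothetical helper for illustration; not part of torch_xla.
    """
    return f"--xla_tpu_scoped_vmem_limit_kib={limit_kib}"


limit_kib = 65536
limit_mib = limit_kib // 1024  # 65536 KiB is 64 MiB

# Copy the current environment and add the flag, as the shell prefix
# LIBTPU_INIT_ARGS=... does in the test plan.
env = dict(os.environ, LIBTPU_INIT_ARGS=scoped_vmem_flag(limit_kib))
```

The resulting `env` mapping could be passed to `subprocess.run(..., env=env)` to launch the test script with the same setting.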


vanbasten23 · 2025-02-03T22:10:51Z

The Build and test / CPU tests / test (benchmark_tests) failure is unrelated to this PR. PR #8668, which contains no changes, fails the same check.

miladm · 2025-02-03T22:43:41Z

cc on-duty @lsy323 to assist with the CI test failure before we merge, @vanbasten23

test/tpu/run_tests.sh

bythew3i reviewed Jan 31, 2025


vanbasten23 force-pushed the xiowei/add_ragged_paged_attention branch from ad2f87c to 9e4b227 on February 1, 2025 00:32

bythew3i approved these changes Feb 1, 2025


vanbasten23 added 8 commits February 3, 2025 05:40

add the first version. All tests pass except for test_paged_attention_extreme_one_tokens_per_sequence.

b129e9a

all tests passed.

6c3bf73

all tests passed except for one test which oom'ed.

dbc3fee

Improved the tests and all tests passed except for the OOM one. Also added runtime check.

e589c86

clean up

9b3cdab

linter

ab69feb

address pr comments

cd65cc4

fix the rest of comments

7fe5071

vanbasten23 force-pushed the xiowei/add_ragged_paged_attention branch from 9e4b227 to 7fe5071 on February 3, 2025 05:41

Trigger CI

5170402

vanbasten23 requested review from lsy323 and miladm February 3, 2025 21:31

miladm approved these changes Feb 3, 2025


miladm assigned vanbasten23 Feb 3, 2025

miladm added the pallas label Feb 3, 2025

miladm reviewed Feb 3, 2025

test/tpu/run_tests.sh (resolved)

vanbasten23 merged commit 8480094 into master Feb 4, 2025
11 of 12 checks passed

miladm mentioned this pull request Feb 7, 2025

[WIP] [V1] TPU support vllm-project/vllm#11936

Closed


4 participants
