Commit 6a962ef

Fix trtllm-gen attention illegal memory access (#2002)
## 📌 Description

This PR fixes an illegal memory access in the trtllm-gen attention kernels. It changes the workspace buffer from `int_workspace_buffer` to `float_workspace_buffer`. `int_workspace_buffer` is a fixed-size buffer that is not initialized to zero, so it should not be used here.

## 🔍 Related Issues

Issue #1928

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).

## Reviewer Notes

## Summary by CodeRabbit

* **Bug Fixes**
  * Fixed memory allocation in the decode module to improve computation accuracy and stability during text generation.
1 parent bb6b620 commit 6a962ef

1 file changed: +1 -1 lines changed


flashinfer/decode.py

Lines changed: 1 addition & 1 deletion
@@ -1988,7 +1988,7 @@ def paged_run(
     q.contiguous(),  # NOTE(Siyuan): without contiguous, the result is incorrect
     paged_k_cache,
     paged_v_cache,
-    int_workspace_buffer,
+    float_workspace_buffer,
     block_tables,
     kv_lens_buffer,
     max_kv_len,
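
The commit message does not spell out how the non-zeroed buffer leads to the illegal access. As a rough illustration only, the toy model below shows one way a routine that assumes zero-initialized scratch memory can index out of bounds when handed a buffer with leftover contents. Every name here (`run_kernel`, `NUM_SLOTS`, the workspace layout) is hypothetical and is not part of FlashInfer or trtllm-gen.

```python
# Hypothetical toy model; none of these names come from FlashInfer.
NUM_SLOTS = 4


def run_kernel(workspace, num_slots):
    """Stand-in for a kernel that keeps a progress counter in workspace[0].

    It assumes the counter starts at 0, uses it as a slot index, and then
    increments it. A leftover counter value produces an out-of-range index.
    """
    counter = workspace[0]
    if not 0 <= counter < num_slots:
        # Python raises here; a GPU kernel would simply touch bad memory.
        raise IndexError(f"counter {counter} outside [0, {num_slots})")
    workspace[1 + counter] = 1  # mark the claimed slot as written
    workspace[0] = counter + 1  # advance the counter for the next call
    return counter


# Zero-initialized workspace: the first call claims slot 0.
zeroed = [0] * (1 + NUM_SLOTS)
first_slot = run_kernel(zeroed, NUM_SLOTS)

# Workspace with stale data in the counter position.
stale = [7, 0, 0, 0, 0]
try:
    run_kernel(stale, NUM_SLOTS)
    stale_ok = True
except IndexError:
    stale_ok = False
```

In this sketch the zeroed workspace behaves as expected while the stale one fails the bounds check. This matches the reasoning given in the description for moving away from a buffer that is not initialized to zero.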
