
Fix ring buffer indexing in AIE AQL queue SubmitPackets#3619

Open
aamarnat wants to merge 1 commit into ROCm:develop from aamarnat:users/aamarnat/fix-aie-queue-ring-buffer


Motivation

Fix a ring buffer overflow bug in the AIE AQL queue that causes out-of-bounds memory access when queue indices wrap around. The SubmitPackets function used the absolute packet indices (cur_id and peak_pkt_id) directly to access the ring buffer, without reducing them modulo the queue size. Once an index reached the queue size (64 packets), reads went past the end of the buffer and crashed.

Technical Details

In AieAqlQueue::SubmitPackets():

  • Added queue_size variable to cache the queue size
  • Changed packet access from queue_base + cur_id to queue_base + (cur_id % queue_size)
  • Changed lookahead packet access from queue_base + peak_pkt_id to queue_base + (peak_pkt_id % queue_size)
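
The indexing change above can be illustrated with a minimal sketch. This is not the ROCr-runtime code: `Packet`, `RingSlot`, and `ConsumePackets` are hypothetical names chosen for illustration, and the real AQL packet layout and SubmitPackets logic differ. It only shows how an absolute, ever-growing packet index is mapped to a slot inside the ring before the pointer arithmetic happens.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an AQL packet; the real packet layout differs.
struct Packet {
  uint64_t payload = 0;
};

// Map an absolute (monotonically increasing) packet index to a ring slot.
inline uint64_t RingSlot(uint64_t absolute_id, uint64_t queue_size) {
  return absolute_id % queue_size;
}

// Read the payloads of packets [read_id, write_id) from the ring buffer.
// Using queue_base + cur_id here instead of queue_base + RingSlot(...)
// would step past the end of the buffer once cur_id >= queue_size.
inline std::vector<uint64_t> ConsumePackets(const Packet* queue_base,
                                            uint64_t queue_size,
                                            uint64_t read_id,
                                            uint64_t write_id) {
  std::vector<uint64_t> payloads;
  for (uint64_t cur_id = read_id; cur_id < write_id; ++cur_id) {
    const Packet* pkt = queue_base + RingSlot(cur_id, queue_size);
    payloads.push_back(pkt->payload);
  }
  return payloads;
}
```

With a 64-entry ring, absolute indices 62 through 65 map to slots 62, 63, 0, and 1, so a read that crosses the wrap point stays inside the buffer.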

The producer side (e.g., the GGML code at https://github.com/ypapadop-amd/ggml/blob/68e0094019edabe519d9f50240d4aa5378bd01b3/src/ggml-hsa/aie-kernel.cpp#L29-L31) already applies modulo when writing packets, but the consumer side (in ROCr-runtime: here and here) did not apply it when reading them, so the two sides disagreed on packet locations once the indices passed the queue size.

Test Plan

  • Ran a test that performs matrix multiplication operations that exceed 64 AIE kernel dispatches
  • Verified that dispatches beyond queue size 64 complete successfully
  • Previously, the test would crash after approximately 64 dispatches due to out-of-bounds memory access
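
The behavior described in this test plan can be approximated without AIE hardware by a small simulation. The sketch below is an assumption-laden model, not the actual test: `SimulateDispatches` is a hypothetical helper, the ring holds plain integers rather than AQL packets, and an out-of-range consumer index is reported by stopping the loop instead of actually reading out of bounds.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns how many dispatches the consumer could locate inside the ring.
// The producer always writes with modulo. When use_modulo is false, the
// consumer uses the absolute index, and the first index at or beyond
// queue_size is treated as the point where the real code would crash.
inline uint64_t SimulateDispatches(uint64_t num_dispatches,
                                   uint64_t queue_size,
                                   bool use_modulo) {
  std::vector<uint64_t> ring(queue_size, 0);
  uint64_t completed = 0;
  for (uint64_t id = 0; id < num_dispatches; ++id) {
    // Producer: tag the slot with id + 1 so that 0 means "never written".
    ring[id % queue_size] = id + 1;

    // Consumer: fixed path wraps the index, unfixed path does not.
    const uint64_t consumer_index = use_modulo ? id % queue_size : id;
    if (consumer_index >= queue_size) {
      break;  // Would be an out-of-bounds read in the unfixed code.
    }
    if (ring[consumer_index] != id + 1) {
      break;  // Consumer found a different packet than the one just written.
    }
    ++completed;
  }
  return completed;
}
```

Under these assumptions, `SimulateDispatches(100, 64, true)` completes all 100 dispatches, while `SimulateDispatches(100, 64, false)` stops after 64, which matches the crash point reported above.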

Test Result

Test passes with 100+ dispatches completing successfully. Queue IDs increment past the original crash point (64) and continue to work correctly with proper ring buffer wrap-around.

Submission Checklist

@aamarnat aamarnat force-pushed the users/aamarnat/fix-aie-queue-ring-buffer branch 2 times, most recently from 4cd0de5 to 2313d04 Compare March 2, 2026 22:00
@ypapadop-amd ypapadop-amd force-pushed the users/aamarnat/fix-aie-queue-ring-buffer branch 10 times, most recently from a2d497f to 17d1074 Compare March 10, 2026 20:42
Use modulo to properly index into the ring buffer when accessing packets
in SubmitPackets. The queue indices (cur_id and peak_pkt_id) can grow
unbounded and exceed the queue size, so modulo is required to wrap
around correctly.

Without this fix, accessing packets beyond the ring buffer size causes
out-of-bounds memory access and undefined behavior after the queue wraps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ypapadop-amd ypapadop-amd force-pushed the users/aamarnat/fix-aie-queue-ring-buffer branch from 17d1074 to e49a874 Compare March 11, 2026 00:29

5 participants