Skip to content

[TransferEngine] Clean up failed io_uring sub-batch initialization#2403

Merged
alogfans merged 1 commit into
kvcache-ai:mainfrom
he-yufeng:fix/io-uring-sub-batch-init
Jun 12, 2026
Merged

[TransferEngine] Clean up failed io_uring sub-batch initialization#2403
alogfans merged 1 commit into
kvcache-ai:mainfrom
he-yufeng:fix/io-uring-sub-batch-init

Conversation

@he-yufeng

Copy link
Copy Markdown
Contributor

Summary

  • keep the output sub-batch unchanged until io_uring_queue_init succeeds
  • return the temporary IOUringSubBatch to its slab when ring initialization fails
  • prevent callers from later freeing an uninitialized ring

Problem

allocateSubBatch published the newly allocated object before initializing its ring. If io_uring_queue_init failed, the function returned an error but left the caller holding the invalid object and leaked its slab allocation.

Fixes #2370.

Validation

  • git diff --check
  • source-path review against current upstream/main

To verify

  • TENT C++ build and CI on a Linux host with liburing
  • failure-path coverage for io_uring_queue_init, where available

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a memory leak and potential invalid state assignment in IOUringTransport::allocateSubBatch. If io_uring_queue_init fails, the allocated io_uring_batch is now properly deallocated, and the output batch reference is only assigned upon successful initialization. There are no review comments, so I have no feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@codecov-commenter

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@jfeng18 jfeng18 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Two improvements over the original code:

  1. Moves batch = io_uring_batch after io_uring_queue_init succeeds, so the caller never holds a pointer to an uninitialized ring.
  2. Adds Slab::deallocate on the failure path to prevent memory leak.

Note: our PR #2379 touches the same file but a different function (getTransferStatus, not allocateSubBatch), so no merge conflict.

@he-yufeng

Copy link
Copy Markdown
Contributor Author

Checked the red build (3.10) job. The run reached the test phase and storage_backend_e2e_test passed as test 79/80; the job then hit The operation was canceled at the 6-hour mark during the tail of the run/cleanup. The other build, build-flags, wheel, docker, docs, format, and path checks are green.

I do not have permission to rerun the failed job from this account. Could a maintainer rerun build (3.10) / the CI gate?

@alogfans alogfans merged commit bb7b906 into kvcache-ai:main Jun 12, 2026
18 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[TENT] io_uring allocateSubBatch sets output before ring init

4 participants