io: always cleanup AsyncFd registration list on deregister by F4RAN · Pull Request #7773 · tokio-rs/tokio

F4RAN · 2025-12-12T18:59:13Z

Fixes memory leak when fd is closed before AsyncFd drop.

Motivation

When a file descriptor is closed before dropping AsyncFd, OS deregistration fails and causes an early return, preventing cleanup of the internal registration list. This leaks ScheduledIo objects over time.

Solution

Always clean up the internal registration list, even if OS deregistration fails. Store the OS result, perform cleanup, then return the error. This is safe because the list is Tokio's internal tracking - if OS deregister fails, the OS already doesn't track it, so cleanup doesn't break memory safety.
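The "store the OS result, clean up, then return the error" ordering can be sketched in isolation. This is a minimal illustration, not Tokio's actual code: `RegistrationSet`, `os_deregister`, `deregister_buggy`, and `deregister_fixed` are hypothetical stand-ins, and a `Vec<u64>` stands in for the internal registration list.

```rust
use std::io;

/// Hypothetical stand-in for the runtime's internal registration list.
struct RegistrationSet {
    entries: Vec<u64>,
}

impl RegistrationSet {
    fn remove(&mut self, token: u64) {
        self.entries.retain(|t| *t != token);
    }
}

/// Hypothetical stand-in for the OS deregistration call.
/// It fails when the fd was already closed, as described above.
fn os_deregister(fd_still_open: bool) -> io::Result<()> {
    if fd_still_open {
        Ok(())
    } else {
        Err(io::Error::new(io::ErrorKind::Other, "bad file descriptor"))
    }
}

/// Buggy shape: `?` returns early, so the entry is never removed on failure.
fn deregister_buggy(set: &mut RegistrationSet, token: u64, fd_still_open: bool) -> io::Result<()> {
    os_deregister(fd_still_open)?;
    set.remove(token);
    Ok(())
}

/// Fixed shape: keep the OS result, always clean up, then report the error.
fn deregister_fixed(set: &mut RegistrationSet, token: u64, fd_still_open: bool) -> io::Result<()> {
    let os_result = os_deregister(fd_still_open);
    set.remove(token);
    os_result
}
```

In both versions the caller still sees the error. The only difference is whether the bookkeeping entry survives a failed OS call.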

Includes a test reproducing the bug scenario.

Fixes memory leak when fd is closed before AsyncFd drop. Fixes: tokio-rs#7563

ADD-SP

if OS deregister fails, the OS already doesn't track it, so cleanup doesn't break memory safety.

Could you please explain why? I'm curious about it.

ADD-SP

Only cleaning up the intrusive list may not be a proper fix of this issue. There are some questions we need to answer.

Shall we propagate the error to the downstream user?
When / When not propagate the error?
How to propagate this error?
Will our fix cause a breaking change?

F4RAN · 2025-12-13T09:58:11Z

Hi @ADD-SP,

Thanks for the feedback! I've updated the PR to address your questions.

Test Infrastructure Added

Added test-only methods to verify the fix:

registration_set.rs: pending_release_count() and total_registration_count() (needed since LinkedList has no len())
driver.rs: Wrapper methods
handle.rs: Methods to expose counts for integration tests

All gated with #[cfg(feature = "full")] and #[doc(hidden)]. Kept /// TEST PURPOSE RELATED TO PR #7773 comments for easy removal later.

Test Evolution

The initial test only exercised the code path. It now checks the registration count after each iteration and asserts that the count doesn't grow (growth would indicate a leak).

Platform-Specific Behavior

The leak cannot be reproduced on macOS. On Linux with epoll:

With buggy code: Test fails, detecting 30 leaked ScheduledIo objects
With fix: Test passes, count stays at baseline

The test is Linux-specific for leak detection but exercises the code path on macOS too.

Why Cleanup After OS Failure is Safe

if OS deregister fails, the OS already doesn't track it, so cleanup doesn't break memory safety.

Could you please explain why? I'm curious about it.

When mio::Registry::deregister() fails (typically EBADF for closed fds), the OS kernel already doesn't track this fd in its polling mechanism (epoll/kqueue). The ScheduledIo in Tokio's internal registrations list is just our bookkeeping - removing it doesn't affect the kernel or other resources. Since the OS doesn't know about the fd anymore, cleaning up our internal tracking is safe and doesn't break memory safety.

Remaining Questions

Only cleaning up the intrusive list may not be a proper fix of this issue. There are some questions we need to answer.

Shall we propagate the error to the downstream user?

When / When not propagate the error?

How to propagate this error?

Will our fix cause a breaking change?

Per your comment in #7563, I see two options for into_inner(): (1) add try_into_inner() (non-breaking) or (2) panic on error. I recommend Option 1 because it is non-breaking and leaves the choice to the user, but I can implement Option 2 if you prefer.

Shall we propagate the error to the downstream user?
Yes, but make it optional via try_into_inner() so users can choose.
When / When not propagate the error?
Not in Drop, which cannot return errors. In into_inner(), yes, via an optional try_into_inner(), while into_inner() itself keeps ignoring errors for compatibility.
How to propagate this error?
Add try_into_inner() -> Result<T, (T, io::Error)> (Option 1).
Will our fix cause a breaking change?
No.
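To make the proposed signature concrete, here is a rough sketch of how Option 1 could look. Everything in it is hypothetical: `FdWrapper` is a toy stand-in for `AsyncFd<T>`, the `fd_still_open` flag simulates the OS call, and `try_into_inner()` is only the proposal from this comment, not an existing Tokio API.

```rust
use std::io;

/// Hypothetical wrapper standing in for `AsyncFd<T>`; not Tokio's real type.
struct FdWrapper<T> {
    inner: T,
    /// Simulates whether OS deregistration would succeed.
    fd_still_open: bool,
}

impl<T> FdWrapper<T> {
    /// Simulated deregistration: fails if the fd was already closed.
    fn deregister(&self) -> io::Result<()> {
        if self.fd_still_open {
            Ok(())
        } else {
            Err(io::Error::new(io::ErrorKind::Other, "bad file descriptor"))
        }
    }

    /// Existing behaviour in this sketch: ignore the deregistration error.
    fn into_inner(self) -> T {
        let _ = self.deregister();
        self.inner
    }

    /// Proposed shape: hand back the inner value in both cases,
    /// together with the error when deregistration failed.
    fn try_into_inner(self) -> Result<T, (T, io::Error)> {
        match self.deregister() {
            Ok(()) => Ok(self.inner),
            Err(err) => Err((self.inner, err)),
        }
    }
}
```

Returning the inner value inside the `Err` variant means the caller never loses ownership of `T`, even when deregistration fails.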

Thanks again!

F4RAN · 2026-01-01T06:05:15Z

Can Miri or address sanitizer capture this memory leak?
Hi again @ADD-SP,
Thanks for your idea.

I'd appreciate it if you could check my RSS-based test:

Summary

I tested several approaches as suggested:

| Approach | Result |
|---|---|
| LeakSanitizer | ❌ Cannot detect - objects are still reachable via internal linked list (logical leak, not unreachable memory) |
| Miri | ❌ Cannot run - requires system calls (`epoll_ctl`, `libc::close`, sockets) |

Since this is a "logical leak" (objects stuck in a data structure, not truly unreachable), standard leak detectors don't work.

Solution: RSS-based memory test

I implemented a test that monitors RSS (Resident Set Size) before and after running 5000 iterations. No internal API access needed - just reads /proc/self/statm.
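For readers unfamiliar with `/proc/self/statm`: it is a single line of space-separated counters, and the second field is the resident set size in pages. A minimal sketch of reading it might look like the following. The function names are made up for illustration and this is not the test code from the PR.

```rust
use std::fs;

/// Parses the resident-page count (the second field) from the contents
/// of `/proc/self/statm`. Returns `None` if the field is missing or malformed.
fn parse_resident_pages(statm: &str) -> Option<u64> {
    statm.split_whitespace().nth(1)?.parse().ok()
}

/// Reads the current resident set size in pages.
/// Returns `None` where the file does not exist (for example on non-Linux systems).
fn read_resident_pages() -> Option<u64> {
    let contents = fs::read_to_string("/proc/self/statm").ok()?;
    parse_resident_pages(&contents)
}
```

The value is in pages, so converting it to kilobytes requires the system page size, which this sketch does not query.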

Linux results:

Without fix: FAILED - RSS grew by 1920KB (leak detected!)
With fix: PASSED - RSS stable

The test is gated with #[cfg(target_os = "linux")] since it reads /proc/self/statm.

Instead of checking absolute RSS growth (which varies with allocator behavior), this test now runs multiple phases and checks if memory stabilizes. A real leak causes unbounded growth across all phases; fixed code stabilizes as memory is reused. This approach is more robust across different CI environments where allocator behavior may differ.
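One way to express the "stabilizes across phases" idea is to flag a leak only when usage grows in every phase. The sketch below is an assumption about how such a check could be written. The function name, the tolerance parameter, and the exact rule are illustrative, and the PR's real test may use a different rule.

```rust
/// Returns true if memory grew by more than `tolerance` in every phase,
/// which is the pattern an unbounded leak would produce.
/// `samples` holds one measurement taken at the end of each phase.
/// With fewer than two samples there is nothing to compare, so this returns false.
fn grows_in_every_phase(samples: &[usize], tolerance: usize) -> bool {
    samples.len() >= 2
        && samples
            .windows(2)
            .all(|pair| pair[1] > pair[0].saturating_add(tolerance))
}
```

A warm-up jump in the first phase followed by flat readings is not reported as a leak, because at least one phase stays within the tolerance.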

Darksonn · 2026-01-01T18:10:40Z

Miri does support epoll.

F4RAN · 2026-01-02T04:47:22Z

Miri does support epoll.

Hi @Darksonn.
I tested with Miri on Linux. Both with and without the fix, Miri reports no leak:

test memory_leak_miri_check ... ok
Miri detects unreachable memory, but this is a logical leak - the ScheduledIo objects remain reachable through the internal registrations linked list. They're not orphaned allocations, they're just never removed when the fd is closed before drop. Same reason LSan couldn't detect it. The RSS-based test remains the practical way to verify this fix works - it detects unbounded memory growth across multiple phases.

Darksonn · 2026-01-02T11:26:04Z

Miri detects unreachable memory

Are you saying that miri detects it, and that miri doesn't detect it when the fix is applied? That sounds like a pretty good way to test it.

F4RAN · 2026-01-02T14:35:54Z

Miri detects unreachable memory

Are you saying that miri detects it, and that miri doesn't detect it when the fix is applied? That sounds like a pretty good way to test it.

To clarify: Miri cannot detect this leak - neither with nor without the fix. Miri is designed to detect unreachable memory (orphaned allocations). But this is a logical leak where objects remain reachable through the internal registrations linked list - they're just never removed. Since they're still reachable, Miri sees no problem. Same reason LSan couldn't detect it. The RSS-based test is the practical solution since it measures actual memory growth.

Both with and without the fix, Miri reports no leak

Darksonn · 2026-01-02T14:46:13Z

Relying on RSS is going to be extremely fragile. Your memory allocator may keep memory around even if the program has freed it, and this still counts in RSS.

If we're going the route of measuring memory, then please declare a #[global_allocator] that increments/decrements a static AtomicUsize for tracking the memory actually in use (it may forward calls into the system allocator to perform actual allocation), and use this to measure the amount instead.

The test needs to be moved to its own file so the #[global_allocator] does not affect other tests.
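A counting allocator along these lines could look roughly like the sketch below. It is an illustration of the suggestion, not the code that landed in the PR: the names `CountingAlloc`, `GLOBAL`, and `allocated_bytes` are made up here.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Forwards to the system allocator and tracks bytes currently in use.
struct CountingAlloc {
    allocated: AtomicUsize,
}

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        // Only count allocations that actually succeeded.
        if !ptr.is_null() {
            self.allocated.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.allocated.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// Installing the allocator affects the whole binary, which is why the
// test needs to live in its own file.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc {
    allocated: AtomicUsize::new(0),
};

/// Bytes currently allocated through `GLOBAL`.
fn allocated_bytes() -> usize {
    GLOBAL.allocated.load(Ordering::Relaxed)
}
```

Only `alloc` and `dealloc` are overridden. The default `realloc` and `alloc_zeroed` implementations of `GlobalAlloc` call through to them, so the counter stays consistent. Unlike RSS, this number drops as soon as memory is freed, regardless of what the underlying allocator keeps cached.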

F4RAN · 2026-01-05T06:00:16Z

Relying on RSS is going to be extremely fragile. Your memory allocator may keep memory around even if the program has freed it, and this still counts in RSS.

If we're going the route of measuring memory, then please declare a #[global_allocator] that increments/decrements a static AtomicUsize for tracking the memory actually in use (it may forward calls into the system allocator to perform actual allocation), and use this to measure the amount instead.

The test needs to be moved to its own file so the #[global_allocator] does not affect other tests.

@Darksonn, I implemented a custom #[global_allocator] that tracks allocations via a static AtomicUsize and forwards to the system allocator. The test lives in io_async_fd_memory_leak.rs so the allocator only affects that file.
Verified on Linux: without the fix, the test fails with ~256KB growth per phase; with the fix, it passes with stable memory.

Darksonn

Thanks for adding the new test. One nit below.

tokio/tests/io_async_fd_memory_leak.rs


Referenced by a downstream Renovate dependency update (tokio `1.49.0` → `1.50.0`). The Tokio v1.50.0 release notes (Mar 3rd, 2026) list this change under Fixed: io: always cleanup `AsyncFd` registration list on deregister ([#7773]).

F4RAN added 2 commits December 12, 2025 22:28

io: always cleanup AsyncFd registration list on deregister

931635d

Fixes memory leak when fd is closed before AsyncFd drop. Fixes: tokio-rs#7563

fix: formatter issues

3d7d87d

ADD-SP reviewed Dec 13, 2025

tokio/tests/io_async_fd.rs (outdated, resolved)

ADD-SP reviewed Dec 13, 2025

ADD-SP added the labels A-tokio (Area: The main tokio crate), S-waiting-on-author (Status: awaiting some action (such as code changes) from the PR or issue author), and M-io (Module: tokio/io) on Dec 13, 2025

F4RAN added 5 commits December 13, 2025 13:00

test: in linux environment

b6452c7

test: linux test with fix

bf5c706

chore: remove additional method

e659d60

chore: remove additional debug comments

b44e56d

fix:formatter

94f46b9

fix: style: fix clippy warnings and format code

0a0f94e

F4RAN requested a review from ADD-SP December 13, 2025 10:55

claude bot mentioned this pull request Dec 15, 2025

7773: io: always cleanup AsyncFd registration list on deregister martin-augment/tokio#33

Open

martin-g reviewed Dec 15, 2025

tokio/src/runtime/io/registration_set.rs (outdated, resolved)

tokio/tests/io_async_fd.rs (outdated, resolved)

tokio/tests/io_async_fd.rs (resolved)

tokio/src/runtime/io/driver.rs (outdated, resolved)

F4RAN and others added 6 commits December 15, 2025 15:02

Update tokio/src/runtime/io/registration_set.rs

c4141e3

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>

Update tokio/src/runtime/io/driver.rs

125dc5d

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>

Merge branch 'master' into 7563-fix-asyncfd-leak

4d30241

fix: remove additional imports

45d3da2

restore AsyncFd::try_with_interest()

ab770fa

style: run formatter

e36a489

ADD-SP reviewed Dec 16, 2025

tokio/src/runtime/handle.rs (outdated, resolved)

F4RAN added 2 commits December 16, 2025 20:24

test(internals): gate test-only APIs behind __internal_test (no public exposure)

0b0495f

fix: spelling error is solved using backticks

08d5645

F4RAN requested review from ADD-SP and martin-g December 16, 2025 17:24

martin-g reviewed Dec 16, 2025

tokio/Cargo.toml (outdated, resolved)

F4RAN added 2 commits January 1, 2026 09:36

fix: resolve clippy format

abd4d91

F4RAN force-pushed the 7563-fix-asyncfd-leak branch from cc2358c to 3b8d349 Compare January 1, 2026 06:29

Merge branch 'master' into 7563-fix-asyncfd-leak

3c18df5

Darksonn reviewed Jan 1, 2026

tokio/src/runtime/io/driver.rs (outdated, resolved)

tokio/Cargo.toml (resolved)

F4RAN and others added 2 commits January 2, 2026 07:45

Update tokio/src/runtime/io/driver.rs

3eba842

Co-authored-by: Alice Ryhl <aliceryhl@google.com>

fix: additional line in Cargo file

ffe88ce

F4RAN requested a review from Darksonn January 2, 2026 04:51

F4RAN added 4 commits January 5, 2026 09:13

test: add custom allocator memory leak test for issue tokio-rs#7563

b8ef07c

test: revert to check in linux machine

d84af1c

test: fix test

6fa3271

fix: inline format args to satisfy clippy

fdf771e

Darksonn approved these changes Jan 5, 2026

tokio/tests/io_async_fd_memory_leak.rs (outdated, resolved)

claude bot mentioned this pull request Jan 5, 2026

7773: io: always cleanup AsyncFd registration list on deregister martin-augment/tokio#47

Open

martin-g reviewed Jan 5, 2026

tokio/tests/io_async_fd_memory_leak.rs (outdated, resolved)

tokio/tests/io_async_fd_memory_leak.rs (outdated, resolved)

F4RAN added 2 commits January 5, 2026 12:28

test: address review nits for io_async_fd_memory_leak test

9879ccc

- Add feature flag check to #[cfg]
- Make allocated counter a struct field
- Add error handling for fcntl in set_nonblocking

fix: allocation problem

2977a5f

F4RAN requested review from Darksonn and martin-g January 5, 2026 09:03

martin-g approved these changes Jan 5, 2026

Darksonn merged commit 1280cf8 into tokio-rs:master Jan 14, 2026
88 checks passed

ADD-SP mentioned this pull request Feb 25, 2026

chore: prepare Tokio v1.50.0 #7934

Merged
