fix: close writer and notify relay on target EOF in edge agent#77

Merged
elikoga merged 1 commit into main from fix-edge-agent-fd-leak-on-eof on Mar 27, 2026

@elikoga elikoga commented Mar 27, 2026

Problem

read_from_tcp_and_send() exited on empty read (remote EOF) without closing the asyncio.StreamWriter or removing the entry from active_connections. Every SSH/VNC session that ends naturally leaked one file descriptor.

Under the default systemd soft limit of 1024 fds, the agent eventually exhausts its budget; asyncio.open_connection() then fails with [Errno 16] Device or resource busy (Python 3.13 happy-eyeballs translates EMFILE as EBUSY). From that point the agent stops proxying all connections until it is restarted.

Observed in production on a HomePi4 running thymis-agent continuously: the terminal/SSH relay broke once accumulated sessions hit the fd ceiling.
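
To check how close a long-running process is to that ceiling, something like the following can help. It is a diagnostic sketch only, not part of this PR: `fd_usage` is a hypothetical helper name, the `resource` module is Unix-only, and the `/proc/self/fd` listing is Linux-specific.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Linux-specific: one entry per open descriptor. The count includes
        # the descriptor that listdir() itself opens while scanning.
        open_fds = len(os.listdir("/proc/self/fd"))
    except FileNotFoundError:
        open_fds = None  # no /proc available; count cannot be determined
    return open_fds, soft, hard


if __name__ == "__main__":
    open_fds, soft, hard = fd_usage()
    print(f"open fds: {open_fds}, soft limit: {soft}, hard limit: {hard}")
```

Sampling this periodically while sessions open and close should show the count climbing steadily if descriptors are leaking, and returning to a baseline if they are not.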

Fix

Wrap the read loop in try/finally that:

  • calls writer.close() to release the fd
  • pops from active_connections so stale entries don't accumulate
  • sends EtRConnectionResetMessage to notify the relay the target closed its side (guarded against a closed websocket)
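
The shape of the fix described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the merged diff: the function signature, the `send_to_relay` callback, and the dict-based messages are hypothetical stand-ins for the agent's websocket send and its `EtRConnectionResetMessage` type, and catching a broad `Exception` is one assumed way to guard against a closed websocket.

```python
import asyncio

# Hypothetical registry; the real agent's active_connections may be shaped differently.
active_connections: dict[str, asyncio.StreamWriter] = {}


async def read_from_tcp_and_send(connection_id, reader, writer, send_to_relay):
    """Forward target data to the relay and always clean up on exit."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:  # empty read: the target closed its side (EOF)
                break
            await send_to_relay({"kind": "data", "id": connection_id, "data": data})
    finally:
        writer.close()  # schedules release of the socket's file descriptor
        active_connections.pop(connection_id, None)  # drop the stale entry
        try:
            # Stand-in for sending EtRConnectionResetMessage to the relay.
            await send_to_relay({"kind": "connection_reset", "id": connection_id})
        except Exception:
            pass  # assumed guard: the relay connection may already be closed


async def demo():
    """Run the reader against a local server that sends b'hello' and closes."""

    async def handle(_reader, server_writer):
        server_writer.write(b"hello")
        await server_writer.drain()
        server_writer.close()
        await server_writer.wait_closed()

    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    active_connections["c1"] = writer

    sent = []

    async def send_to_relay(message):
        sent.append(message)

    await read_from_tcp_and_send("c1", reader, writer, send_to_relay)
    server.close()
    await server.wait_closed()
    return sent, writer.is_closing()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Putting the cleanup in `finally` means it runs on EOF, on a read error, and on task cancellation alike, so none of those paths leaves the writer open or the registry entry behind.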

read_from_tcp_and_send() broke on empty read (remote EOF) without
closing the asyncio StreamWriter or removing the entry from
active_connections. Each SSH/VNC session that ends naturally leaked one
file descriptor. On a default systemd soft limit of 1024 fds the process
eventually exhausts its budget; asyncio.open_connection() then fails
with [Errno 16] Device or resource busy (Python 3.13 happy-eyeballs
translates EMFILE as EBUSY).

Fix: wrap the read loop in try/finally that
  - calls writer.close() to release the fd
  - pops from active_connections so stale entries don't accumulate
  - sends EtRConnectionResetMessage to notify the relay the target
    closed its side (guarded against a closed websocket)
elikoga added a commit to Thymis-io/thymis that referenced this pull request Mar 27, 2026
Pins to Thymis-io/http-network-relay#77 which fixes the fd leak in
edge_agent read_from_tcp_and_send(). Revert the pin to main once that
PR is merged.

elikoga commented Mar 27, 2026

Can't hurt, right?

@elikoga elikoga added this pull request to the merge queue Mar 27, 2026
Merged via the queue into main with commit 6cb17b5 Mar 27, 2026
3 checks passed
@elikoga elikoga deleted the fix-edge-agent-fd-leak-on-eof branch March 27, 2026 11:34