Fix: Start SSE keepalive when first handler is registered#83
Fix: Start SSE keepalive when first handler is registered#83dgilperez wants to merge 3 commits intozoedsoupe:mainfrom
Conversation
## Problem
SSE keepalive messages were never sent when the StreamableHTTP transport
started with no registered handlers. The keepalive timer was only scheduled
in init/1, but if no SSE handlers existed at startup, subsequent handler
registrations did not trigger the keepalive loop.
This caused SSE connections to timeout and disconnect, particularly
affecting HTTP-based MCP clients like Claude Code.
## Solution
Modified handle_call({:register_sse_handler, ...}) to check if this is
the first SSE handler being registered (map_size(state.sse_handlers) == 0).
If so, and keepalive is enabled, schedule_keepalive/1 is called to start
the keepalive message loop.
## Testing
Added comprehensive test suite in streamable_http_keepalive_test.exs with
4 test cases:
- Keepalive messages are sent when handler is registered
- Keepalive continues with multiple handlers
- Keepalive stops when all handlers are unregistered
- Keepalive starts immediately when first handler registered after startup
All 523 tests pass.
## Related Issues
May be related to:
- zoedsoupe#81 - SSE Response Race Condition
- cloudwalk/hermes-mcp#227 - Streamable HTTP transport times out
WalkthroughThis change updates the StreamableHTTP transport to schedule SSE keepalive messages via a delayed self-message when the first SSE handler is registered. register_sse_handler now builds and returns an updated state ( Sequence Diagram(s)sequenceDiagram
participant Client
participant StreamableHTTP
participant KeepaliveTimer
Client->>StreamableHTTP: register_sse_handler()
StreamableHTTP->>StreamableHTTP: register handler -> new_state
StreamableHTTP->>StreamableHTTP: should_keepalive?(new_state)?
alt keepalive should start (prev handlers == 0 && enabled)
StreamableHTTP->>KeepaliveTimer: schedule_keepalive(new_state.keepalive_interval)
end
StreamableHTTP-->>Client: {:reply, :ok, new_state}
Note over KeepaliveTimer,StreamableHTTP: later
KeepaliveTimer->>StreamableHTTP: :send_keepalive (self-message)
StreamableHTTP-->>Client: deliver keepalive to SSE handlers
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/anubis/server/transport/streamable_http.ex (1)
227-229: P3 (nitpick): Init keepalive check is now effectively a no-opWith the fix in
handle_call, thisif should_keepalive?(state)block ininit/1will always evaluate tofalsesincesse_handlersstarts empty. It's harmless and could be useful if the transport ever starts with pre-registered handlers, so keeping it is fine—just noting for clarity.
🤖 Fix all issues with AI agents
In `@test/anubis/server/transport/streamable_http_keepalive_test.exs`:
- Around line 141-147: The helper flush_mailbox/0 is implemented inline in the
test but may be duplicated in other tests; if it's reused elsewhere, extract the
flush_mailbox/0 function into a shared test helper under test/support (e.g., a
module like TestSupport.Mailbox or in support/test_helpers.exs), import or alias
that module in test/anubis/server/transport/streamable_http_keepalive_test.exs
and replace the inline definition with a call to the shared flush_mailbox/0 to
avoid duplication and follow project test support guidelines.
- Around line 44-48: Replace the manual sleep + assert_received pattern with
Elixir's assert_receive to make the test more idiomatic and resilient: remove
Process.sleep(250) and assert_received :sse_keepalive and instead call
assert_receive(:sse_keepalive, 250) (or increase timeout if needed); apply the
same change for the other tests that wait for :sse_keepalive to simplify timing
and avoid explicit sleeps.
test/anubis/server/transport/streamable_http_keepalive_test.exs
Outdated
Show resolved
Hide resolved
- Replace Process.sleep + assert_received with assert_receive for more idiomatic timing - Add documentation to flush_mailbox/0 helper - Improve test reliability with proper timeout handling
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/anubis/server/transport/streamable_http_keepalive_test.exs`:
- Around line 97-104: Replace the explicit Process.sleep + refute_received
pattern with ExUnit's timeout-aware negative assertion: keep the flush_mailbox()
call, remove Process.sleep(250) and the subsequent refute_received
:sse_keepalive, and instead call refute_receive :sse_keepalive, 250 to assert no
keepalive arrives within the same interval; locate the assertions around
flush_mailbox/Process.sleep/refute_received in the test and update them to use
refute_receive with the timeout.
test/anubis/server/transport/streamable_http_keepalive_test.exs
Outdated
Show resolved
Hide resolved
- Add comprehensive docstrings to private functions schedule_keepalive/1 and should_keepalive?/1 - Replace Process.sleep + refute_received with refute_receive for more idiomatic test patterns - Addresses CodeRabbit review feedback All changes maintain backward compatibility and improve code documentation.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@lib/anubis/server/transport/streamable_http.ex`:
- Around line 258-266: The keepalive scheduling can overlap when adding the
first SSE handler; modify the logic in the block that calls schedule_keepalive/1
to cancel any existing timer ref before scheduling and store the new timer ref
in state (e.g., add a keepalive_timer or keepalive_ref field to the state
struct/map). Use Process.cancel_timer/1 (or :erlang.cancel_timer) to cancel the
prior timer if present, call schedule_keepalive(new_state.keepalive_interval) to
obtain the new timer ref, and update new_state with that ref so that
should_keepalive?/1, schedule_keepalive/1 and future handler churn always
operate against a single active timer.
♻️ Duplicate comments (1)
test/anubis/server/transport/streamable_http_keepalive_test.exs (1)
130-139: P3: Extractflush_mailbox/0into a test/support helper module
Inline helpers are discouraged for reuse/consistency—please move this intotest/supportand import it here. As per coding guidelines, ...
| new_state = %{state | sse_handlers: sse_handlers} | ||
|
|
||
| # Start keepalive when first SSE handler is registered | ||
| # This fixes the bug where keepalive never starts if server has no handlers at init | ||
| if map_size(state.sse_handlers) == 0 and should_keepalive?(new_state) do | ||
| schedule_keepalive(new_state.keepalive_interval) | ||
| end | ||
|
|
||
| {:reply, :ok, new_state} |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
P3: Guard against overlapping keepalive timers during handler churn 😎
Line 262 schedules a new keepalive on the 0→1 transition; if a prior :send_keepalive is still pending from an earlier loop, this can create overlapping timers and duplicate keepalive sends. Consider tracking/canceling the timer ref so only one loop is active.
🤖 Prompt for AI Agents
In `@lib/anubis/server/transport/streamable_http.ex` around lines 258 - 266, The
keepalive scheduling can overlap when adding the first SSE handler; modify the
logic in the block that calls schedule_keepalive/1 to cancel any existing timer
ref before scheduling and store the new timer ref in state (e.g., add a
keepalive_timer or keepalive_ref field to the state struct/map). Use
Process.cancel_timer/1 (or :erlang.cancel_timer) to cancel the prior timer if
present, call schedule_keepalive(new_state.keepalive_interval) to obtain the new
timer ref, and update new_state with that ref so that should_keepalive?/1,
schedule_keepalive/1 and future handler churn always operate against a single
active timer.
There was a problem hiding this comment.
@coderabbitai could you implement minor change/check?
There was a problem hiding this comment.
@zoedsoupe Sure! I'll implement the change to prevent overlapping keepalive timers by tracking the timer reference in state and canceling any existing timer before scheduling a new one. 😎
Task: Track and cancel keepalive timer refs to prevent overlaps
- Add
keepalive_timer_reffield to state - Cancel existing timer before scheduling new one
- Update all three places that schedule keepalive: init/1, register_sse_handler, and send_keepalive loop
🧠 Learnings used
Learnt from: CR
Repo: zoedsoupe/anubis-mcp PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-09T00:46:26.321Z
Learning: Applies to **/*.exs : Always return updated state from GenServer handle_* callbacks and use `:hibernate` for reduced memory footprint in init callback
Learnt from: CR
Repo: zoedsoupe/anubis-mcp PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-09T00:46:26.321Z
Learning: Applies to **/*.exs : Use `Anubis.MCP.ID` for generating request/response IDs with functions `ID.generate_request_id/0`, `ID.generate_error_id/0`, and use `Process.send_after/3` for request timeouts
💡 For code changes across multiple files, use @coderabbitai plan in an issue comment
There was a problem hiding this comment.
@dgilperez Thank you for the confirmation! I'll proceed with implementing the keepalive timer tracking to prevent overlaps. 🚀
Task: Track and cancel keepalive timer refs to prevent overlaps
- Add
keepalive_timer_reffield to state - Cancel existing timer before scheduling new one
- Update all places that schedule keepalive: init/1, register_sse_handler, send_keepalive loop, and unregister_sse_handler
🧠 Learnings used
Learnt from: CR
Repo: zoedsoupe/anubis-mcp PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-09T00:46:26.321Z
Learning: Applies to **/*.exs : Always return updated state from GenServer handle_* callbacks and use `:hibernate` for reduced memory footprint in init callback
💡 For code changes across multiple files, use @coderabbitai plan in an issue comment
Problem
SSE keepalive messages were never sent when the StreamableHTTP transport started with no registered handlers. The keepalive timer was only scheduled in
init/1, but if no SSE handlers existed at startup, subsequent handler registrations did not trigger the keepalive loop.This caused SSE connections to timeout and disconnect, particularly affecting HTTP-based MCP clients like Claude Code.
Solution
Modified
handle_call({:register_sse_handler, ...})to check if this is the first SSE handler being registered (map_size(state.sse_handlers) == 0). If so, and keepalive is enabled,schedule_keepalive/1is called to start the keepalive message loop.Testing
Added comprehensive test suite in
streamable_http_keepalive_test.exswith 4 test cases:All 523 tests pass (4 new tests added).
Related Issues
May be related to:
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.