Skip to content

Fix: Start SSE keepalive when first handler is registered#83

Open
dgilperez wants to merge 3 commits intozoedsoupe:mainfrom
dgilperez:fix/sse-keepalive-not-starting
Open

Fix: Start SSE keepalive when first handler is registered#83
dgilperez wants to merge 3 commits intozoedsoupe:mainfrom
dgilperez:fix/sse-keepalive-not-starting

Conversation

@dgilperez
Copy link

@dgilperez dgilperez commented Jan 18, 2026

Problem

SSE keepalive messages were never sent when the StreamableHTTP transport started with no registered handlers. The keepalive timer was only scheduled in init/1, but if no SSE handlers existed at startup, subsequent handler registrations did not trigger the keepalive loop.

This caused SSE connections to timeout and disconnect, particularly affecting HTTP-based MCP clients like Claude Code.

Solution

Modified handle_call({:register_sse_handler, ...}) to check if this is the first SSE handler being registered (map_size(state.sse_handlers) == 0). If so, and keepalive is enabled, schedule_keepalive/1 is called to start the keepalive message loop.

Testing

Added comprehensive test suite in streamable_http_keepalive_test.exs with 4 test cases:

  • Keepalive messages are sent when handler is registered
  • Keepalive continues with multiple handlers
  • Keepalive stops when all handlers are unregistered
  • Keepalive starts immediately when first handler registered after startup

All 523 tests pass (4 new tests added).

Related Issues

May be related to:

Summary by CodeRabbit

  • Bug Fixes

    • Improved SSE keepalive behavior so the keepalive timer reliably starts when the first handler registers and stops when the last handler unregisters, ensuring consistent keepalive emission.
  • Tests

    • Added comprehensive tests covering keepalive startup timing, emissions, multiple concurrent handlers, and proper stop/start behavior.

✏️ Tip: You can customize this high-level summary in your review settings.

## Problem
SSE keepalive messages were never sent when the StreamableHTTP transport
started with no registered handlers. The keepalive timer was only scheduled
in init/1, but if no SSE handlers existed at startup, subsequent handler
registrations did not trigger the keepalive loop.

This caused SSE connections to timeout and disconnect, particularly
affecting HTTP-based MCP clients like Claude Code.

## Solution
Modified handle_call({:register_sse_handler, ...}) to check if this is
the first SSE handler being registered (map_size(state.sse_handlers) == 0).
If so, and keepalive is enabled, schedule_keepalive/1 is called to start
the keepalive message loop.

## Testing
Added comprehensive test suite in streamable_http_keepalive_test.exs with
4 test cases:
- Keepalive messages are sent when handler is registered
- Keepalive continues with multiple handlers
- Keepalive stops when all handlers are unregistered
- Keepalive starts immediately when first handler registered after startup

All 523 tests pass.

## Related Issues
May be related to:
- zoedsoupe#81 - SSE Response Race Condition
- cloudwalk/hermes-mcp#227 - Streamable HTTP transport times out
@coderabbitai
Copy link

coderabbitai bot commented Jan 18, 2026

Walkthrough

This change updates the StreamableHTTP transport to schedule SSE keepalive messages via a delayed self-message when the first SSE handler is registered. register_sse_handler now builds and returns an updated state (new_state) and conditionally starts keepalive only on the transition from zero to one handlers when keepalive is enabled. Two new private helpers, schedule_keepalive/1 and should_keepalive?/1, encapsulate scheduling and the keepalive-enabled check. An ExUnit test module exercises keepalive emission and stopping across register/unregister scenarios.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant StreamableHTTP
    participant KeepaliveTimer

    Client->>StreamableHTTP: register_sse_handler()
    StreamableHTTP->>StreamableHTTP: register handler -> new_state
    StreamableHTTP->>StreamableHTTP: should_keepalive?(new_state)?
    alt keepalive should start (prev handlers == 0 && enabled)
        StreamableHTTP->>KeepaliveTimer: schedule_keepalive(new_state.keepalive_interval)
    end
    StreamableHTTP-->>Client: {:reply, :ok, new_state}

    Note over KeepaliveTimer,StreamableHTTP: later
    KeepaliveTimer->>StreamableHTTP: :send_keepalive (self-message)
    StreamableHTTP-->>Client: deliver keepalive to SSE handlers
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main fix: starting SSE keepalive when the first handler is registered after startup.
Description check ✅ Passed The description follows the template with all required sections (Problem, Solution, Rationale) fully completed, plus comprehensive testing details and related issues.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/anubis/server/transport/streamable_http.ex (1)

227-229: P3 (nitpick): Init keepalive check is now effectively a no-op

With the fix in handle_call, this if should_keepalive?(state) block in init/1 will always evaluate to false since sse_handlers starts empty. It's harmless and could be useful if the transport ever starts with pre-registered handlers, so keeping it is fine—just noting for clarity.

🤖 Fix all issues with AI agents
In `@test/anubis/server/transport/streamable_http_keepalive_test.exs`:
- Around line 141-147: The helper flush_mailbox/0 is implemented inline in the
test but may be duplicated in other tests; if it's reused elsewhere, extract the
flush_mailbox/0 function into a shared test helper under test/support (e.g., a
module like TestSupport.Mailbox or in support/test_helpers.exs), import or alias
that module in test/anubis/server/transport/streamable_http_keepalive_test.exs
and replace the inline definition with a call to the shared flush_mailbox/0 to
avoid duplication and follow project test support guidelines.
- Around line 44-48: Replace the manual sleep + assert_received pattern with
Elixir's assert_receive to make the test more idiomatic and resilient: remove
Process.sleep(250) and assert_received :sse_keepalive and instead call
assert_receive(:sse_keepalive, 250) (or increase timeout if needed); apply the
same change for the other tests that wait for :sse_keepalive to simplify timing
and avoid explicit sleeps.

- Replace Process.sleep + assert_received with assert_receive for more idiomatic timing
- Add documentation to flush_mailbox/0 helper
- Improve test reliability with proper timeout handling
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@test/anubis/server/transport/streamable_http_keepalive_test.exs`:
- Around line 97-104: Replace the explicit Process.sleep + refute_received
pattern with ExUnit's timeout-aware negative assertion: keep the flush_mailbox()
call, remove Process.sleep(250) and the subsequent refute_received
:sse_keepalive, and instead call refute_receive :sse_keepalive, 250 to assert no
keepalive arrives within the same interval; locate the assertions around
flush_mailbox/Process.sleep/refute_received in the test and update them to use
refute_receive with the timeout.

- Add comprehensive docstrings to private functions schedule_keepalive/1 and should_keepalive?/1
- Replace Process.sleep + refute_received with refute_receive for more idiomatic test patterns
- Addresses CodeRabbit review feedback

All changes maintain backward compatibility and improve code documentation.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@lib/anubis/server/transport/streamable_http.ex`:
- Around line 258-266: The keepalive scheduling can overlap when adding the
first SSE handler; modify the logic in the block that calls schedule_keepalive/1
to cancel any existing timer ref before scheduling and store the new timer ref
in state (e.g., add a keepalive_timer or keepalive_ref field to the state
struct/map). Use Process.cancel_timer/1 (or :erlang.cancel_timer) to cancel the
prior timer if present, call schedule_keepalive(new_state.keepalive_interval) to
obtain the new timer ref, and update new_state with that ref so that
should_keepalive?/1, schedule_keepalive/1 and future handler churn always
operate against a single active timer.
♻️ Duplicate comments (1)
test/anubis/server/transport/streamable_http_keepalive_test.exs (1)

130-139: P3: Extract flush_mailbox/0 into a test/support helper module
Inline helpers are discouraged for reuse/consistency—please move this into test/support and import it here. As per coding guidelines, ...

Comment on lines +258 to +266
new_state = %{state | sse_handlers: sse_handlers}

# Start keepalive when first SSE handler is registered
# This fixes the bug where keepalive never starts if server has no handlers at init
if map_size(state.sse_handlers) == 0 and should_keepalive?(new_state) do
schedule_keepalive(new_state.keepalive_interval)
end

{:reply, :ok, new_state}
Copy link

@coderabbitai coderabbitai bot Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

P3: Guard against overlapping keepalive timers during handler churn 😎
Line 262 schedules a new keepalive on the 0→1 transition; if a prior :send_keepalive is still pending from an earlier loop, this can create overlapping timers and duplicate keepalive sends. Consider tracking/canceling the timer ref so only one loop is active.

🤖 Prompt for AI Agents
In `@lib/anubis/server/transport/streamable_http.ex` around lines 258 - 266, The
keepalive scheduling can overlap when adding the first SSE handler; modify the
logic in the block that calls schedule_keepalive/1 to cancel any existing timer
ref before scheduling and store the new timer ref in state (e.g., add a
keepalive_timer or keepalive_ref field to the state struct/map). Use
Process.cancel_timer/1 (or :erlang.cancel_timer) to cancel the prior timer if
present, call schedule_keepalive(new_state.keepalive_interval) to obtain the new
timer ref, and update new_state with that ref so that should_keepalive?/1,
schedule_keepalive/1 and future handler churn always operate against a single
active timer.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai could you implement minor change/check?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zoedsoupe Sure! I'll implement the change to prevent overlapping keepalive timers by tracking the timer reference in state and canceling any existing timer before scheduling a new one. 😎

Task: Track and cancel keepalive timer refs to prevent overlaps

  • Add keepalive_timer_ref field to state
  • Cancel existing timer before scheduling new one
  • Update all three places that schedule keepalive: init/1, register_sse_handler, and send_keepalive loop

🧠 Learnings used
Learnt from: CR
Repo: zoedsoupe/anubis-mcp PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-09T00:46:26.321Z
Learning: Applies to **/*.exs : Always return updated state from GenServer handle_* callbacks and use `:hibernate` for reduced memory footprint in init callback

Learnt from: CR
Repo: zoedsoupe/anubis-mcp PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-09T00:46:26.321Z
Learning: Applies to **/*.exs : Use `Anubis.MCP.ID` for generating request/response IDs with functions `ID.generate_request_id/0`, `ID.generate_error_id/0`, and use `Process.send_after/3` for request timeouts

💡 For code changes across multiple files, use @coderabbitai plan in an issue comment

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect, you can proceed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai you can proceed :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgilperez Thank you for the confirmation! I'll proceed with implementing the keepalive timer tracking to prevent overlaps. 🚀

Task: Track and cancel keepalive timer refs to prevent overlaps

  • Add keepalive_timer_ref field to state
  • Cancel existing timer before scheduling new one
  • Update all places that schedule keepalive: init/1, register_sse_handler, send_keepalive loop, and unregister_sse_handler

🧠 Learnings used
Learnt from: CR
Repo: zoedsoupe/anubis-mcp PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-09T00:46:26.321Z
Learning: Applies to **/*.exs : Always return updated state from GenServer handle_* callbacks and use `:hibernate` for reduced memory footprint in init callback

💡 For code changes across multiple files, use @coderabbitai plan in an issue comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants