feat(sglang): relay forward pass metrics to event plane#7376

Closed
ishandhanani wants to merge 3 commits into ai-dynamo:main from ishandhanani:idhanani/sglang-fpm-relay

Conversation

Contributor

@ishandhanani ishandhanani commented Mar 14, 2026

Summary

  • Extend DynamoSglangPublisher with init_fpm_relay() to subscribe to sglang's ForwardPassMetrics ZMQ PUB and relay to the event plane
  • Map DYN_FORWARDPASS_METRIC_PORT env var to sglang's server_args.forward_pass_metrics_port in args.py
  • Reuse the same FpmEventRelay Rust binding as the vLLM integration (feat: ForwardPassMetrics dynamo event plane integration #7250)
  • Handle DP attention topology (one relay per local DP rank)

Motivation

Companion to sgl-project/sglang#20567 which adds per-iteration ForwardPassMetrics emission from sglang's scheduler. This PR wires the Dynamo side to receive those metrics and forward them to the event plane for consumption by the planner.

Architecture

sglang scheduler (SchedulerMetricsMixin):
  _emit_forward_pass_metrics() -> _FpmPublisherThread -> ZMQ PUB (localhost)

Dynamo parent (DynamoSglangPublisher):
  init_fpm_relay() -> FpmEventRelay (ZMQ SUB) -> Event Plane (NATS)

Consumer (planner):
  FpmEventSubscriber -> decode() -> ForwardPassMetrics
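
The three stages above can be sketched with standard-library stand-ins to show the data flow. This is an illustration only: a `queue.Queue` plays the role of the ZMQ PUB/SUB pair, a plain list plays the event plane, and every function name here is hypothetical rather than taken from sglang or Dynamo.

```python
import queue
import threading

# Sentinel used to tell the relay that the scheduler has finished.
_STOP = object()


def scheduler_loop(channel, iterations):
    """Emit one metrics record per forward-pass iteration, then signal stop."""
    for i in range(iterations):
        phase = "prefill" if i == 0 else "decode"
        channel.put({"iteration": i, "phase": phase})
    channel.put(_STOP)


def relay_loop(channel, event_plane):
    """Forward every record from the local channel to the event plane as-is."""
    while True:
        item = channel.get()
        if item is _STOP:
            break
        event_plane.append(item)


def run_pipeline(iterations):
    """Run scheduler and relay together and return what the consumer would see."""
    channel = queue.Queue()
    event_plane = []
    relay = threading.Thread(target=relay_loop, args=(channel, event_plane))
    relay.start()
    scheduler_loop(channel, iterations)
    relay.join()
    return event_plane
```

The relay never inspects or rewrites a record. That mirrors the design described here, where decoding happens only on the consumer side.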

Changes

  • args.py: Map DYN_FORWARDPASS_METRIC_PORT env var to server_args.forward_pass_metrics_port before engine creation
  • publisher.py: Add init_fpm_relay() method reading port from server_args, cleanup in cleanup(), call from setup_sgl_metrics()

Test plan

  • Wire compatibility verified: sglang-encoded ForwardPassMetrics decoded by Dynamo's decoder
  • Live server test: sglang with --forward-pass-metrics-port 20380 emits prefill/decode metrics per iteration
  • Full stack test with event plane consumer (requires NATS/etcd)

…elay

Extend DynamoSglangPublisher with init_fpm_relay() to subscribe to
sglang's new ForwardPassMetrics ZMQ PUB and relay to the Dynamo event
plane. Uses the same FpmEventRelay Rust binding as the vLLM integration.

Gated behind DYN_FORWARDPASS_METRIC_PORT env var. Creates one relay
per local DP rank, handles multi-node DP attention topology.
@github-actions
Contributor

👋 Hi ishandhanani! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the feat label Mar 14, 2026
@github-actions github-actions bot added the external-contribution (Pull request is from an external contributor), documentation (Improvements or additions to documentation), and backend::sglang (Relates to the sglang backend) labels Mar 14, 2026
Map DYN_FORWARDPASS_METRIC_PORT env var to sglang's new
server_args.forward_pass_metrics_port in args.py. The publisher
reads from server_args instead of env var directly.

Labels

  • backend::sglang: Relates to the sglang backend
  • documentation: Improvements or additions to documentation
  • external-contribution: Pull request is from an external contributor
  • feat
  • size/L
